This invention relates to an image processing
apparatus and method. In particular, this invention
relates to an image processing apparatus and method for
use in the creation of a three-dimensional computer model
of a real-life object from two-dimensional image data
representing different views of the object to be
modelled. Generally, this image data consists of a set
of still images or video frames recorded at different
relative orientations or positions of the object and the
recording camera.

In order to create the three-dimensional computer
model, a three-dimensional object surface is generated
from the set of image data and data defining the relative
positions or orientations at which each of the images was
recorded.

One known way of generating a three-dimensional
object surface from the image data is to use a technique
known as "voxel carving", which is described in detail in
a paper entitled "Rapid Octree Construction from Image
Sequences" by Richard Szeliski published in CVGIP: Image
Understanding, Vol. 58, No. 1, July 1993.
In this method, a number of images of the object whose
three-dimensional surface is to be modelled are produced
such that each image shows a silhouette of the object
surrounded by a background. The relative orientation
between the object and the camera position at which each
image was taken, together with characteristics of the
camera (such as focal length and the size of the image
aperture), are used to determine the location and
orientation of each image relative to a model volume or
space which is divided into subsidiary volume elements or
voxels to form a voxel space. Each non-occluded voxel is
then projected into the images. Voxels that project into
background portions of the images are removed from the
voxel space. This procedure continues until no
background voxels remain. At this stage, the surface
voxels of the voxel space should define the outline or
silhouette of the object shown in the images.
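
By way of illustration only, the silhouette test at the heart of this technique can be sketched in Python as follows. The sketch is not taken from the paper mentioned above: the `carve` function, the toy 4 x 4 x 4 voxel space and the orthographic `front` and `side` views are assumptions made for brevity, whereas a real system would project each voxel using the calibrated camera position and focal length. Because this sketch removes a voxel as soon as it projects into the background of any view, without regard to occlusion, a single pass over all voxels is sufficient here.

```python
from itertools import product


def carve(voxels, views):
    """Keep only voxels that project inside the silhouette in every view.

    views is a list of (project, silhouette) pairs, where project maps a
    voxel (x, y, z) to a pixel (u, v) and silhouette is the set of pixels
    that show the object rather than the background.
    """
    return {
        v for v in voxels
        if all(project(v) in silhouette for project, silhouette in views)
    }


# Hypothetical toy scene: a 4 x 4 x 4 voxel space seen orthographically
# along the z axis (front view) and along the x axis (side view).
voxel_space = set(product(range(4), repeat=3))
front = (lambda v: (v[0], v[1]), {(1, 1), (1, 2), (2, 1), (2, 2)})
side = (lambda v: (v[2], v[1]), {(0, 1), (0, 2), (1, 1), (1, 2)})

carved = carve(voxel_space, [front, side])
```

The set `carved` then contains only those voxels whose projections fall on object pixels in both views.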

Although the above-described technique works
satisfactorily where there is a well-defined boundary
between the object and the background in the image,
difficulties can arise where the boundary between the
object and the background is ill-defined or difficult to
distinguish because, for example, there is insufficient
distinction in colour or brightness between the
background and object pixels in the images. In practice,
the above-described technique works well only when the
conditions under which the images are acquired are well
controlled so that there is a distinguishable
boundary between the edge of the object and the
background in each image.

Another technique for generating a three-dimensional
object surface from images that does not rely on being
able to separate each image into object and background
is described in the University of Rochester
Computer Sciences Technical Report No. 680 of January
1998 entitled "What do N Photographs Tell Us About 3D
Shape?" and a University of Rochester Computer Sciences
Technical Report No. 692 of May 1998 entitled "A Theory
of Shape by Space Carving", both by Kiriakos N.
Kutulakos and Stephen M. Seitz. The technique described
in these papers is known as "space carving" or "voxel
colouring" and relies on the fact that the
viewpoint of each image or photograph is known in a
common 3D world reference frame and scene radiance
follows a known, locally computable radiance function,
that is, so that effects such as shadows, transparencies
and inter-reflections can be ignored. In this technique,
the three-dimensional model space is again divided into
voxels. A non-occluded voxel is then projected into each
image in turn. The colour of the patch of pixels to
which the voxel projects is determined for each image.
If the colours are different or not consistent, then it
is determined that that voxel does not form part of the
3D object's surface and that voxel is removed or
discarded. Each non-occluded voxel is visited in turn
and the process is repeated until the remaining
non-occluded voxels are all photo or colour consistent.
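
By way of illustration only, a colour consistency test of the kind just described might be sketched in Python as follows. The names `patch_mean` and `is_photoconsistent` and the per-channel `threshold` of 30 are hypothetical; the reports mentioned above define consistency in terms of a radiance function rather than this simple comparison of mean patch colours.

```python
def patch_mean(patch):
    """Average colour of a pixel patch given as a list of (r, g, b) tuples."""
    n = len(patch)
    return tuple(sum(pixel[c] for pixel in patch) / n for c in range(3))


def is_photoconsistent(patches, threshold=30.0):
    """Decide whether one voxel looks the same colour in every image.

    patches holds one pixel patch per image in which the voxel is visible.
    The voxel is treated as consistent when, for each colour channel, the
    mean patch colours differ by no more than threshold (an assumed value).
    """
    means = [patch_mean(patch) for patch in patches]
    for c in range(3):
        values = [mean[c] for mean in means]
        if max(values) - min(values) > threshold:
            return False
    return True
```

A voxel for which `is_photoconsistent` returns `False` would be removed from the voxel space; otherwise it would be retained as a candidate surface voxel.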

The initial voxel space needs to be defined relative
to the object. If the initial voxel space is too large,
then a large number of computations will be required and
a large number of voxels will need to be removed before
the final 3D object surface is generated.

One way to ensure that the initial voxel space is 
not too large is described in the aforementioned 
University of Rochester Computer Sciences Technical 
Reports. This method involves first identifying
background pixels in each image and then restricting the
voxel space to, for each image, a cone defined by the 
position and/or orientation at which the image was taken 
of the object and the identified non-background pixels in 
the image. Thus, in this method the initial voxel space 
is defined as the intersection of cones each projecting
from the effective focal point of a corresponding image
through the boundary or silhouette of the object in that 
image. This technique for defining the initial voxel 
volume therefore requires that the boundary be identified 
between the object and the background pixels in each
image as described in the aforementioned paper by Richard 
Szeliski. Where the boundary between the object and the 
background is well-defined and precise, then this 
technique should not cause any problems in the generation 
of the three-dimensional object surface, although it will 
increase the amount of computation required to arrive at 
the three-dimensional object surface. However, where the 
boundary between the object and the background in each 
image is not well-defined and identifiable, then errors 
may arise in definition of that boundary so that, for 
example, the initial voxel space does not include all of 
the voxels that project into the object in the images. 
This can cause severe problems in the subsequent 
generation of the three-dimensional object surface. The 
reason for this is that, if the boundary erroneously 
excludes object voxels, then the relative relationship 
between voxels in the initial voxel space will be 
incorrect and voxels that should have been occluded by 
other voxels may not be occluded, and vice versa. Where
a voxel that should have been occluded is not occluded,
then the subsequent colour or photoconsistency check 
described above will almost certainly result in that 
voxel being determined to be photo-inconsistent, so 
resulting in the erroneous removal of that voxel. This 
erroneous voxel removal will compound the error discussed 
above and may itself result in one or more other voxels 
being erroneously removed and so on. Indeed, this
initial error in definition of the voxel space may lead 
to a catastrophic failure in that so many voxels may be 
erroneously removed that it is not possible to generate 
the 3D object's surface. 

The above described voxel colouring or space carving 
technique also relies on the individual pixel patches 
being formed of pixels of the same or very similar 
colours. If there is a variation in colour between the 
pixels of a pixel patch, then the photoconsistency check 
may not provide accurate results and it is possible that 
a voxel that actually forms part of the required 3D 
object surface (an 'object voxel') may be erroneously 
removed. The erroneous removal of that voxel may have
knock-on effects so that further object voxels are 
erroneously removed. This erroneous removal may, in 
turn, cause erroneous removal of further voxels. The
erroneous removal of a single voxel may, in certain
cases, effectively cause a cascade or chain reaction and
may cause the voxel colouring process to fail, that is, it
may be impossible to provide a 3D model of the object 
surface because too many (possibly even all) of the 
object voxels may be removed. 

In the above described space carving or voxel
colouring process, each voxel in turn is projected into 
each of the images in which it is visible. Because of 
the computational power and time required, it is 
generally not possible to carry out this process using 
more than 20-30 images. Depending upon the nature of the 
object whose three dimensional surface is to be modelled, 
this number of images may be insufficient to provide a 
realistic 3D model of the object surface. 

In this known voxel colouring technique, if a voxel
that actually forms part of the required 3D object 
surface is erroneously removed (because, for example, of 
shadows or highlights affecting the colours in the 
images), then the removal of that voxel may have knock-on 
effects so that further object voxels are erroneously 
removed. This erroneous removal may, in turn, cause 
erroneous removal of further voxels. The erroneous 
removal of a single voxel may, in certain cases,
effectively cause a cascade or chain reaction and may
cause the voxel colouring process to fail, that is, it may
be impossible to provide a 3D model of the object surface 
because too many (possibly even all) of the object voxels 
may be removed. 

It is an aim of the present invention to provide 
image processing apparatus and a method of operating such 
image processing apparatus that enable the initial voxel 
space for a voxel colouring or space carving technique to 
be defined so as to avoid excessive computation whilst 
also avoiding or at least reducing the possibility of 
erroneous voxel removal.

In one aspect, the present invention provides image 
processing apparatus having processing means operable to 
define an initial voxel space from which a three- 
dimensional object surface is to be generated by defining 
the initial voxel space as the volume bounded by the 
intersection of a number of cones with each cone having 
its apex at a respective one of the focal points and 
having its surface defined by lines extending from the 
focal point through the boundary of the corresponding 
camera aperture or imaging area for a respective one of 
the images from which the three-dimensional object 
surface is to be generated. This avoids an arbitrary
definition of the initial voxel space and enables the
initial voxel space to be precisely defined while 
ensuring that all object voxels (that is voxels that 
project into the object in the images) are within the 
initial voxel space so as to avoid or at least reduce the 
possibility of catastrophic failure mentioned above. 
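
By way of illustration only, membership of the volume bounded by the intersection of such cones might be tested in Python as sketched below. The helper names (`inside_cone`, `in_initial_voxel_space`) and the two toy cameras are assumptions introduced for this sketch. Each cone is treated as a four-sided pyramid whose apex is the focal point and whose edges pass through the corners of the imaging area, listed in order around its boundary.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def inside_cone(point, focal_point, corners):
    """True if point lies within the pyramid from focal_point through corners."""
    centre = tuple(sum(c[i] for c in corners) / 4 for i in range(3))
    axis = sub(centre, focal_point)          # direction the camera looks in
    p = sub(point, focal_point)
    for i in range(4):
        a = sub(corners[i], focal_point)
        b = sub(corners[(i + 1) % 4], focal_point)
        normal = cross(a, b)                 # normal of one side face
        if dot(normal, axis) < 0:            # orient the normal inwards
            normal = tuple(-n for n in normal)
        if dot(normal, p) < 0:               # point is outside this face
            return False
    return True


def in_initial_voxel_space(point, cameras):
    """True if point lies inside the viewing cone of every camera."""
    return all(inside_cone(point, f, corners) for f, corners in cameras)


# Two hypothetical cameras: one looking along +z, one looking along +x.
cameras = [
    ((0, 0, -10), [(-1, -1, -9), (1, -1, -9), (1, 1, -9), (-1, 1, -9)]),
    ((-10, 0, 0), [(-9, -1, -1), (-9, 1, -1), (-9, 1, 1), (-9, -1, 1)]),
]
```

Because the cones depend only on the camera data and not on any segmentation of the images, no object voxel can be excluded by a badly identified object boundary.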

It is an aim of the present invention to provide 
image processing apparatus and a method of operating such 
image processing apparatus that avoids or at least 
mitigates or reduces the possibility of erroneous removal 
of a voxel. 

In one aspect, the present invention provides image 
processing apparatus having processing means operable to 
test whether a voxel forms part of a 3D object, the 
processing means being arranged, where it cannot 
determine whether a voxel forms part of the 3D object 
surface, to sub-divide that voxel into subsidiary voxels 
and to repeat the test for each of the subsidiary voxels. 
If desired, this sub-division may be continued until each 
subsidiary voxel projects only into a single pixel in 
each image. Such apparatus embodying the present
invention should enable a more accurate determination of
the 3D object surface even where there is significant
colour variation within a pixel patch into which a voxel
projects.
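
By way of illustration only, the sub-division of an undecided voxel might be sketched in Python as follows. The names `subdivide` and `classify`, the three-valued result of the hypothetical `test` callable and the policy of retaining a voxel that is still undecided at the smallest permitted size are all assumptions made for this sketch.

```python
def subdivide(voxel):
    """Split a cubic voxel (x, y, z, size) into eight subsidiary voxels."""
    x, y, z, size = voxel
    half = size / 2
    return [(x + dx * half, y + dy * half, z + dz * half, half)
            for dx in (0, 1) for dy in (0, 1) for dz in (0, 1)]


def classify(voxel, test, min_size):
    """Return the voxels, possibly subsidiary ones, judged to be object voxels.

    test(voxel) is a hypothetical check returning "object", "background"
    or "uncertain".
    """
    result = test(voxel)
    if result == "object":
        return [voxel]
    if result == "background":
        return []
    if voxel[3] <= min_size:
        # Assumed policy: keep a voxel that cannot be sub-divided further,
        # so as to err on the side of not removing an object voxel.
        return [voxel]
    kept = []
    for child in subdivide(voxel):
        kept.extend(classify(child, test, min_size))
    return kept


def toy_test(voxel):
    """Toy stand-in for the test: the object occupies the region x < 1."""
    x, _, _, size = voxel
    if x + size <= 1:
        return "object"
    if x >= 1:
        return "background"
    return "uncertain"


surface = classify((0, 0, 0, 2), toy_test, min_size=0.5)
```

In this toy case the undecided voxel of side 2 is split once, and the four subsidiary voxels lying in the region x < 1 are retained.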

It is an aim of the present invention to provide
image processing apparatus and a method of operating such
image processing apparatus that enable the number of
images of an object used during a voxel colouring process
to be increased so as to enable a more precise 3D object
surface to be generated without excessively increasing
the amount of computational power and time required for
the process.

It is an aim of the present invention to provide 
image processing apparatus and a method of operating such 
image processing apparatus that enable recovery of
a voxel colouring process from potential catastrophic
failure without necessarily having to completely restart 
the voxel colouring process. 

In one aspect, the present invention provides image
processing apparatus having processing means operable to
determine, using a first set of image data, the
photoconsistency of non-occluded voxels of an initial
voxel space to provide a first 3D object surface and then
to refine that first 3D object surface by checking the
photoconsistency of non-occluded voxels of that first 3D
object surface against image data for one or more further
images.

In one aspect, the present invention provides image
processing apparatus having processing means operable to 
provide a 3D model of a surface of a 3D object by 
checking the photoconsistency of non-occluded voxels of 
an initial voxel space for a first set of image data, 
storing the results of that check as a first 3D object 
surface and then refining the first 3D object surface by 
checking the photoconsistency of non-occluded voxels 
using one or more further images of the object and one or 
more of the images used to produce the first 3D object 
surface. 

In either of the above described aspects, the 
processing means may be operable to repeat the refinement 
one or more further times adding one or more further 
images each time.

In one aspect, the present invention provides image 
processing apparatus having processing means operable to 
provide a model of a 3D object surface by checking the 
photoconsistency of voxels of a voxel space using images 
of the object, and then to repeat that process using 
further images so as to further refine the 3D object 
surface model until a final 3D object surface model is 
produced, whereby the processing means is operable to use 
at least one additional image in each photoconsistency
check and to store the 3D object surface generated by at
least one of the previous photoconsistency checks before 
carrying out the next photoconsistency check so that, if 
the next photoconsistency check results in the erroneous 
removal of one or more object voxels, the processing 
means can return to the results of the stored previous 
photoconsistency check. 
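
By way of illustration only, refinement with a stored fallback result might be sketched in Python as follows. The function `refine_with_rollback`, the hypothetical `carve` callable and, in particular, the `min_fraction` criterion used to decide that a check has gone wrong are assumptions made for this sketch; how an erroneous removal is detected is not specified in the passage above.

```python
def refine_with_rollback(initial_voxels, image_batches, carve, min_fraction=0.5):
    """Refine a voxel set by adding one batch of images at a time.

    carve(voxels, images) is a hypothetical photoconsistency pass that
    returns the voxels surviving a check against the given images.
    A pass that leaves fewer than min_fraction of the stored voxels is
    treated (by assumption) as a failure and is undone.
    """
    current = set(initial_voxels)
    used = []
    for batch in image_batches:
        stored = set(current)                 # result of the previous check
        candidate = carve(current, used + batch)
        if len(candidate) < min_fraction * len(stored):
            current = stored                  # return to the stored result
        else:
            current = candidate
            used = used + batch
    return current


def toy_carve(voxels, images):
    """Toy stand-in: each image is simply the set of voxels it supports."""
    return voxels & set.intersection(*images)


result = refine_with_rollback(
    range(1, 11),
    [[set(range(1, 9))], [{1}], [{1, 2, 3, 4, 5, 6, 7, 9}]],
    toy_carve,
)
```

In this toy run the second batch would remove all but one voxel, so the stored result is restored and the third batch is then applied to it.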

In one aspect, the present invention provides image 
processing apparatus having processing means operable to 
provide a model of a 3D object surface by checking the 
photoconsistency of voxels of a voxel space using images 
of the object, and then to repeat that process using 
further images so as to further refine the 3D object 
surface model until a final 3D object surface model is 
produced, the processing means also being operable to 
store the image data for one or more of the images 
previously used for a photoconsistency check and to 
discard the oldest of the stored images and replace it 
with the newest used image each time the photoconsistency 
check is repeated so that the processing means is 
operable to store a running set of images thereby 
enabling a photoconsistency check to be carried out using 
the stored images together with a newly added image so 
that the processing means has available the raw image
data for each of the stored images and not simply the 3D
object surface that resulted from the previous 
photoconsistency check. This should enable, for example, 
restoration of inadvertently removed voxels when the 
addition of new image data causes the processing means to 
conclude that a voxel is in fact an object voxel when a 
previous photoconsistency check determined that that 
voxel was inconsistent. 
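
By way of illustration only, a running set of stored images might be kept in Python as sketched below. The class name `RunningImageSet`, its methods and the capacity of three images are assumptions made for this sketch.

```python
from collections import deque


class RunningImageSet:
    """Holds the most recently used images, discarding the oldest first."""

    def __init__(self, capacity):
        # A deque with maxlen drops its oldest entry when a new one is
        # appended to a full deque.
        self._images = deque(maxlen=capacity)

    def add(self, image):
        self._images.append(image)

    def images_for_check(self, new_image):
        """Stored images plus the newly added image, oldest first."""
        return list(self._images) + [new_image]


running = RunningImageSet(capacity=3)
for name in ("image_a", "image_b", "image_c"):
    running.add(name)

first_check = running.images_for_check("image_d")
running.add("image_d")                        # image_a is discarded here
second_check = running.images_for_check("image_e")
```

Each photoconsistency check then has the raw data of the stored images available together with the newly added image.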



Embodiments of the present invention will now be 
described, by way of example only, with reference to the 
accompanying drawings, in which: 

Figure 1 shows schematically the components of a 
modular system in which the present invention may be 
embodied;

Figure 2 shows a block diagram of processing 
apparatus for putting into effect one or more of the 
modules shown in Figure 1; 

Figure 3 shows a top level flowchart for 
illustrating generation of a three-dimensional object 
surface using the processing apparatus shown in Figure 2;

Figure 4 shows a flowchart for illustrating the step 
shown in Figure 3 of defining an initial voxel space;

Figure 5 shows a flowchart illustrating in greater
detail the step of determining the viewing cones for each
camera position shown in Figure 4;

Figure 6 shows a flowchart illustrating in greater
detail the step shown in Figure 4 of determining the
volume bounded by the intersection of the viewing cones;

Figure 7 shows in greater detail the step shown in 
Figure 4 of defining voxels within the initial voxel 
space;

Figures 8 and 9 are schematic representations for 
illustrating a camera arrangement and the associated 
initial voxel space with Figure 9 being a side 
elevational view (with the front camera omitted in the 
interests of clarity) and Figure 8 showing a cross- 
sectional view taken along the lines VIII-VIII in 
Figure 9;

Figures 10a and 10b show diagrammatic perspective 
views to illustrate division of two different initial 
voxel spaces into voxels; 

Figure 11 shows a part-sectional perspective view of 
part of the voxel space shown in Figure 10a so as to 
illustrate more clearly the division of the voxel space 
into voxels; 

Figure 12 shows a flowchart for illustrating in 
greater detail a method of carrying out the step shown in
Figure 3 of determining the voxels defining the
three-dimensional object surface;

Figure 13 shows schematically the projection of a 
voxel onto part of an image; 

Figures 14a to 14d show flowcharts illustrating in 
greater detail steps carried out in a method of carrying 
out step S21 of Figure 12; 

Figure 15 shows a flowchart for illustrating one way 
of carrying out the further processing step shown in 
Figure 14a; 

Figure 16 shows a diagrammatic representation of a 
portion of the part of the image shown diagrammatically
in Figure 13 to illustrate a pixel patch formed by
projection of a subsidiary voxel into the image;

Figure 17 shows a flowchart for illustrating another 
way of carrying out the additional processing step shown 
in Figure 14a;

Figure 18 shows a flowchart for illustrating another 
method of carrying out a voxel colouring process; 

Figure 19 illustrates diagrammatically one form of
colour space; 

Figure 20 illustrates a plane of the colour space 
shown in Figure 19; 

Figure 21 shows a flowchart illustrating in greater
detail another way of carrying out the step S2 in Figure
3 of determining the voxels defining the 3D object
surface;

Figures 22a and 22b show a flowchart illustrating in 
greater detail the step of performing a voxel colouring 
process using a current voxel space and a new image shown 
in Figure 21; 

Figure 23 shows a flowchart illustrating another way 
of carrying out step S2 in Figure 3; 

Figures 24a and 24b show a flowchart illustrating in 
greater detail the step of performing a voxel colouring 
process using a current voxel space and a new set of 
images shown in Figure 23; and 

Figure 25 shows a very schematic view similar to 
Figure 8 for use in explaining the effect of adding 
further images.

Figure 1 schematically shows the components of a 
modular system in which the present invention may be 
embodied. 

These components can be effected as processor- 
implemented instructions, hardware or a combination 
thereof.

Referring to Figure 1, the components are arranged 
to process data defining images (still or moving) of one
or more objects in order to generate data defining a
three-dimensional computer model of the object(s). 

The input image data may be received in a variety of 
ways, such as directly from one or more digital cameras, 
via a storage device such as a disk or CD ROM, by 
digitisation of photographs using a scanner, or by 
downloading image data from a database, for example via 
a datalink such as the Internet, etc. 

The generated 3D model data may be used to: display
an image of the object(s) from a desired viewing
position; control manufacturing equipment to manufacture
a model of the object(s), for example by controlling
cutting apparatus to cut material to the appropriate
dimensions; perform processing to recognise the
object(s), for example by comparing it to data stored in
a database; carry out processing to measure the
object(s), for example by taking absolute measurements to
record the size of the object(s), or by comparing the
model with models of the object(s) previously generated
to determine changes therebetween; carry out processing
so as to control a robot to navigate around the
object(s); store information in a geographic information
system (GIS) or other topographic database; or transmit
the object data representing the model to a remote
processing device for any such processing, either on a
storage device or as a signal (for example, the data may
be transmitted in virtual reality modelling language
(VRML) format over the Internet, enabling it to be
processed by a WWW browser); etc.

The feature detection and matching module 2 is 
arranged to receive image data recorded by a still camera 
from different positions relative to the object(s) (the
different positions being achieved by moving the camera
and/or the object(s)). The received data is then
processed in order to match features within the different 
images (that is, to identify points in the images which 
correspond to the same physical point on the object(s)). 

The feature detection and tracking module 4 is 
arranged to receive image data recorded by a video camera
as the relative positions of the camera and object(s) are
changed (by moving the video camera and/or the
object(s)). As in the feature detection and matching
module 2, the feature detection and tracking module 4
detects features, such as corners, in the images.
However, the feature detection and tracking module 4 then
tracks the detected features between frames of image data
in order to determine the positions of the features in
other images.

The camera position calculation module 6 is arranged
to use the features matched across images by the feature 
detection and matching module 2 or the feature detection 
and tracking module 4 to calculate the transformation 
between the camera positions at which the images were 
recorded and hence determine the orientation and position 
of the camera focal plane when each image was recorded. 

The feature detection and matching module 2 and the 
camera position calculation module 6 may be arranged to 
perform processing in an iterative manner. That is, 
using camera positions and orientations calculated by the 
camera position calculation module 6, the feature 
detection and matching module 2 may detect and match 
further features in the images using epipolar geometry in 
a conventional manner, and the further matched features 
may then be used by the camera position calculation 
module 6 to recalculate the camera positions and 
orientations. 

If the positions at which the images were recorded 
are already known, then, as indicated by arrow 8 in
Figure 1, the image data need not be processed by the 
feature detection and matching module 2, the feature 
detection and tracking module 4, or the camera position 
calculation module 6. For example, the images may be
recorded by mounting a number of cameras on a calibrated
rig arranged to hold the cameras in known positions
relative to the object(s).

Alternatively, it is possible to determine the
positions of a plurality of cameras relative to the
object(s) by adding calibration markers to the object(s)
and calculating the positions of the cameras from the
positions of the calibration markers in images recorded
by the cameras. The calibration markers may comprise
patterns of light projected onto the object(s). Camera
calibration module 10 is therefore provided to receive
image data from a plurality of cameras at fixed positions
showing the object(s) together with calibration markers,
and to process the data to determine the positions of the
cameras. A preferred method of calculating the positions
of the cameras (and also internal parameters of each
camera, such as the focal length etc) is described in a
paper entitled "Calibrating and 3D Modelling with a
Multi-Camera System" by Wiles and Davison published in
1999 IEEE Workshop on Multi-View Modelling and Analysis of
Visual Scenes, ISBN 0769501109.

The 3D object surface generation module 12 is
arranged to receive image data showing the object(s) and
data defining the positions at which the images were
recorded, and to process the data to generate a 3D
computer model representing the actual surface(s) of the
object(s), such as a polygon mesh model.

The texture data generation module 14 is arranged to 
generate texture data for rendering onto the surface 
model produced by the 3D object surface generation 
module 12. The texture data is generated from the input 
image data showing the object(s). 

Techniques that can be used to perform the 
processing in the modules shown in Figure 1 are described
in EP-A-0898245, EP-A-0901105, pending US applications
09/129077, 09/129079 and 09/129080, the full contents of
which are incorporated herein by cross-reference, and
also the attached Annex.

The present invention may be embodied in particular 
as part of the 3D object surface generation module 12. 

Figure 2 shows a block diagram of processing 
apparatus 20. 

The processing apparatus 20 comprises a main 
processing unit 21 having a central processing unit (CPU) 
22 with associated memory (ROM and/or RAM) 22a. The
CPU 22 is coupled to an input device 23 (which may
consist, in known manner, of a keyboard and a pointing
device such as a mouse), a display 24, a mass-storage
system 25 such as a hard disc drive, and a removable disc
drive (RDD) 26 for receiving a removable disc (RD) 27.
The removable disc drive 26 may be arranged to receive
a removable disc 27 such as a floppy disc, a CD ROM or a
writable CD ROM. The CPU 22 may also be coupled to an
interface I for receiving signals S carrying processor
implementable instructions and/or data. The interface
may comprise, for example, a connection to a network such
as the Internet, an intranet, a LAN (local area network)
or a WAN (wide area network) or may comprise a data link
to another processing apparatus, for example an infrared
link.



The processing apparatus 20 is configured to form
the 3D object surface generation module 12 shown in
Figure 1 by means of processor implementable instructions
and/or data stored in the memory 22a. Processor
implementable instructions and/or data stored in the
memory may also configure the apparatus to form any one
or more of the other modules shown in Figure 1. These
processor implementable instructions and/or data may be
prestored in the memory 22a or may be supplied to the
main processing unit 21 as a signal S via the interface
I or on a removable disc 27 or may be supplied to the
main processing unit 21 by any combination of these
techniques.

3D object surface data resulting from use of the
processing apparatus 20 in a manner to be described below
may be stored in the mass-storage system 25 and may also
be displayed on the display 24. The 3D object surface
data may also be downloaded to a removable disc 27 or
supplied as a signal S via the interface I. The 3D
object surface data may be subsequently processed by the
processing apparatus 20 when configured to operate as the
texture data generation module 14 shown in Figure 1.
Such further processing may, however, be carried out by
another processing apparatus which receives the 3D object
surface data via, for example, a removable disc 27 or as
a signal S from the processing apparatus 20 shown in
Figure 2.

Operation of the processing apparatus 20 shown in
Figure 2 to generate a three-dimensional object surface
will now be described.

The data necessary to enable generation of the 3D 
object surface will have been obtained as described above 
with reference to Figure 1 and will already be stored in 
the mass-storage system 25 for access by the CPU 22. 
This data includes image data for each of the images of 
the object to be used to generate the 3D object surface.
Each of the images is stored in the mass-storage
system 25 as an array of pixel values with each pixel of
each image being allocated a number identifying the
colour of that pixel. Typically, for grey shades the
number will be between 0 and 255, giving a possibility of
256 grey shades, while for full colour the number will be
between 0 and 255 for each primary colour (generally red,
green and blue).

The image data is accompanied by camera data
representing the relative position and orientation with
respect to the object of the camera positions at which
the images were obtained and internal parameters of the
camera or cameras (such as the focal length and the
dimensions of the imaging area or viewing window of the
cameras). This camera data may be obtained in the
manner described above with reference to modules 2 and 6
in Figure 1 or modules 4 and 6 in Figure 1, or module 10
in Figure 1 or, as indicated by the arrow 8 in Figure 1,
the position and relative orientation data may be
obtained directly from known camera positions. The
camera internal parameters may be prestored in the
apparatus, input by the user using the input device 23 or
determined as described in the aforementioned paper by
Wiles and Davison (ISBN 0769501109).

Figure 3 shows a top level flowchart for
illustrating generation of the 3D object surface from
this data. At step S1, an initial voxel space containing
the required 3D object surface is defined by the CPU 22.

Once the initial voxel space has been defined, then 
the photoconsistency of each non-occluded voxel is 
checked in turn to determine the voxels defining the 3D 
object surface at step S2. The defined 3D object surface 
is then stored at step S3. 

Step S1 of Figure 3 will now be described in more
detail with reference to the flowchart shown in Figure 4.
At step S11, the CPU 22 accesses the camera internal
parameters and position data stored in the mass-storage
system 25 (Figure 2). At step S12, the CPU 22
determines, using the camera internal parameters and 
positions, the viewing cone for each camera position. 

At step S13, the CPU 22 determines the volume bounded
by the intersection of the viewing cones of the camera
positions. At step S14, the CPU 22 sets the bounded volume
as the initial voxel volume. At step S15, the CPU 22
sub-divides the initial voxel volume into cubic or
right-parallelepipedal voxels arranged in a cubic,
close-packed array so as to form the initial voxel space.

Figure 5 shows a flowchart illustrating in greater
detail step S12 of Figure 4. At step S121, the CPU 22
determines from the data stored in the mass-storage
system 25, the focal point of the camera for the camera
position for a first one of the stored images. At step
S122, the CPU 22 determines from the camera data stored 
in the mass-storage system the side lengths and location 
in three dimensional space of the imaging area relative 
to the focal point. At step S123, the CPU defines 
rectilinear straight lines projecting from the determined 
focal point and each passing through and projecting 
beyond a respective different one of the corners of the 
imaging area. At step S124, the CPU stores the volume 
bounded by the straight lines as the viewing cone by 
storing the relative orientations of the straight lines. 
At step S125, the CPU 22 determines whether the viewing
cone for another camera position needs to be determined.
If the answer is yes, then the CPU 22 repeats steps S121
to S125 until the answer at step S125 is no when the CPU
22 proceeds to step S13 in Figure 4.
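
By way of illustration only, steps S121 to S124 might be sketched as follows. The sketch assumes that the imaging area is treated as a virtual image plane lying between the focal point and the object, and it stores the viewing cone as the inward-facing normals of the four planes spanned by adjacent corner lines rather than as line orientations. The names viewing_cone and in_cone are hypothetical and do not appear in the description.

```python
from typing import List, Sequence, Tuple

Vec = Tuple[float, float, float]


def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def viewing_cone(focal: Vec, corners: Sequence[Vec]) -> List[Vec]:
    """Return inward-facing normals of the four side planes of the cone.

    corners: the four corners of the imaging area, listed in order
    around its perimeter (assumed convention).
    """
    # Straight lines from the focal point through each corner (step S123).
    rays = [sub(c, focal) for c in corners]
    # The sum of the corner rays points along the middle of the cone.
    centre = (sum(r[0] for r in rays),
              sum(r[1] for r in rays),
              sum(r[2] for r in rays))
    normals = []
    for i in range(4):
        n = cross(rays[i], rays[(i + 1) % 4])
        if dot(n, centre) < 0:  # flip so that the normal faces inwards
            n = (-n[0], -n[1], -n[2])
        normals.append(n)
    return normals


def in_cone(point: Vec, focal: Vec, normals: Sequence[Vec]) -> bool:
    """True when the point lies inside or on the boundary of the cone."""
    d = sub(point, focal)
    return all(dot(n, d) >= 0 for n in normals)
```

A point then lies within the volume bounded by several viewing cones when in_cone holds for every camera position.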

Figure 6 shows a flowchart illustrating in greater 
detail step S13 shown in Figure 4. At step S131, the CPU 
22 selects the stored data representing the viewing cones 
of first and second ones of the camera positions. At 
step S132, the CPU 22 determines the planes of
intersection between the first and second camera viewing
cones using the stored data representing the straight 
lines defining the viewing cones. At step S133, the 
CPU 22 stores the volume bounded by the planes of 
intersection of the viewing cones as an estimated volume. 
At step S134, the CPU checks to see whether there is 
another camera position whose viewing cone intersection 
has not yet been determined. If the answer at step S134
is yes, then the CPU determines at step S135 the planes
of intersection between the current estimated volume and 
the next camera position viewing cone and then stores the 
volume bounded by those planes of intersection as the new 
estimated volume at step S133. Steps S134, S135 and S133 
are repeated until the answer at step S134 is no at which 
point the CPU stores the estimated volume as the volume 
bounded by the camera viewing cones at step S136 and 
returns to step S14 in Figure 4 at which the bounded 
volume is set as the initial voxel volume. 

Figure 7 shows a flowchart illustrating in greater 
detail step S15 of Figure 4. At step S151, the CPU 22 
divides a volume or space containing the initial voxel 
space into cubic or right-parallelopipedal voxels 
arranged in a close-packed array. The CPU 22 then 
discards at step S152 any voxels lying outside the
boundary of the determined initial voxel volume. At step
S153, the CPU 22 discards any voxels through which the 
boundary of the initial voxel volume passes and at step 
S154 stores the remaining voxels as the initial voxel 
space. 
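
As a minimal sketch of steps S151 to S154, assuming that the initial voxel volume is convex and is described by a list of bounding planes, a voxel can be kept only when all eight of its corners lie inside the volume. This single test discards both the voxels lying outside the boundary (step S152) and the voxels through which the boundary passes (step S153). The function names and the plane representation are assumptions made for the illustration.

```python
from itertools import product


def inside(point, planes):
    """planes: list of (normal, offset) pairs; a point is inside the
    volume when dot(normal, point) + offset >= 0 for every plane."""
    return all(n[0] * point[0] + n[1] * point[1] + n[2] * point[2] + d >= 0
               for n, d in planes)


def initial_voxel_space(planes, lo, hi, size):
    """Divide the box from lo to hi into cubes of edge `size` and return
    the origins of the cubes lying wholly within the bounded volume.

    Assumes the box extent is a whole multiple of `size` on each axis.
    """
    counts = [round((hi[k] - lo[k]) / size) for k in range(3)]
    kept = []
    for i, j, k in product(range(counts[0]), range(counts[1]), range(counts[2])):
        origin = (lo[0] + i * size, lo[1] + j * size, lo[2] + k * size)
        corners = [(origin[0] + dx * size,
                    origin[1] + dy * size,
                    origin[2] + dz * size)
                   for dx, dy, dz in product((0, 1), repeat=3)]
        # Keep the voxel only if every corner is inside the volume.
        if all(inside(c, planes) for c in corners):
            kept.append(origin)
    return kept
```

For a non-convex volume the corner test would not be sufficient, which is why convexity is stated as an assumption here; the intersection of viewing cones is convex.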

Figures 8 and 9 show one example of a camera 
position arrangement to illustrate an example of an 
initial voxel space derived in the manner described
above.

In the example shown in Figures 8 and 9, the camera 
position arrangement consists of four camera positions A 
to D arranged in a single plane (the plane of the paper 
of Figure 8 in this example) and spaced apart by an angle 
of 90° relative to one another about a central axis X
indicated by the dotted line in Figure 9.

Each of the camera positions has a focal point F_A to
F_D (in this example the focal lengths are all the same
although this need not necessarily be the case) and an
imaging area I_A to I_D (see Figure 10) defined by the
camera aperture in the case of a camera using 
photographic film or by the CCD sensing area in the case 
of a CCD camera. Again, in this example, the imaging 
areas I of all four cameras are the same. 

Figures 8 to 10 show by way of the dashed lines the
viewing cones VC_A, VC_B, VC_C and VC_D of each of the
camera positions A to D. Figures 8 and 9 also show the
relative locations of the images IM_A, IM_B, IM_C and
IM_D produced at
the camera positions A to D. 

The volume bounded by the intersection of the 
viewing cones of the camera positions A to D is 
identified by the reference sign VB in Figures 8 and 9. 

As illustrated schematically in Figure 8, the voxel 
space VS defined by the CPU 22 in the manner described 
above with reference to Figures 3 to 7 lies wholly within 
the volume VB and consists of a close-packed cubic array 
of cubic (or right parallelopipedal) voxels V each of 
which lies wholly within the bounding volume VB. 

Figure 10a shows a perspective view for the camera 
arrangement shown in Figures 8 and 9 to illustrate the 
overall appearance of the voxel space VS in relation to
the 3D surface 40 to be generated, in this case a bust of
a man. It will, of course, be appreciated that
Figure 10a necessarily shows the voxels V very
schematically and, because of the very small size of the
voxels V, is not accurate. Figure 11 shows a
part-sectional perspective view of part P of the voxel
space VS shown in Figure 10a to illustrate more clearly
how the boundary of the voxel space VS is made up of a
step-like arrangement of voxels V.

It will, of course, be appreciated that the shape of
the bound volume VB defined by the intersection of the
camera viewing cones will depend upon the relative
orientations and numbers of the cameras and also upon the
individual viewing cones which will in turn depend upon
the focal points or positions of the cameras and the size
and shapes of their imaging areas. To illustrate this,
Figure 10b shows very schematically the initial voxel
space VS' where the camera arrangement comprises four
cameras A' to D' arranged above and looking down on the
object and four cameras A" to D" arranged below and
looking up at the object with, as in the example
described above, the cameras being spaced at 90°
intervals around the object. The periphery of the voxel
space VS itself is, of course, determined by the boundary
of the volume VB and the size of the voxels relative to
the size of the bound volume VB. The size of the voxels,
and thus the resolution to which the 3D object surface
can be generated, will depend upon the available
computational capacity of the CPU 22 and the time
available for the computation of the 3D object surface.
Typically, the voxel space VS may consist of 100,000
voxels or up to several millions of voxels.

The method described above of defining the initial
voxel volume by the intersection of the viewing cones of 
the camera positions avoids the disadvantages discussed 
above of defining the initial voxel volume using the 
silhouette or boundary of the object whose surface is to 
be generated and should also reduce the number of 
computations required to achieve the final 3D object 
surface in contrast to arrangements where the initial 
voxel space is defined arbitrarily so as to be 
sufficiently large to enclose the 3D object whose surface 
is to be generated. 

A method of generating the 3D object surface
starting from the initial voxel space VS will now be
described with reference to Figures 10a, 12, 14a to 14d,
13 and 15.

Figure 12 shows a top level flow chart for this 
method. At step S21, the CPU 22 performs a test 
procedure for a first one of the surface voxels n of the 
initial voxel space VS to determine whether it should be 
removed, retained or sub-divided and then performs 
further processing in accordance with that determination 
so that the voxel is removed, retained or sub-divided and 
the sub-voxels subjected to further processing as will be
described below.

At step S22, the CPU 22 repeats the procedure of
step S21 for the remaining surface voxels until each
of the surface voxels of the initial voxel space has been
processed in accordance with step S21.

The CPU 22 then determines at step S23 whether any
voxel or sub-voxel has been removed and, if the answer is
yes, resets its counters at step S24 so as to enable
steps S21 and S22 to be repeated for the remaining
voxels. Steps S21 and S22 are repeated until the answer
at step S23 is no. The reason for repeating the voxel
sweep effected by steps S21 and S22 when voxels have been
removed is that the removal of a voxel or sub-voxel may
cause voxels that were previously completely occluded by
other voxels or sub-voxels to become non-occluded or
partially non-occluded at least for some images and may
also cause voxels or sub-voxels that were previously
hidden by other voxels or sub-voxels from certain of the
images to be projectable into those images. Thus, the
removal of a voxel or sub-voxel may affect the
photoconsistency of the remaining voxels and sub-voxels.
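
The repeated voxel sweep of steps S21 to S24 may be pictured with the following minimal sketch. It assumes a caller-supplied test function that returns either "retain" or "remove" for a voxel given the voxels currently remaining; sub-division is left out for brevity. The names carve and test_voxel are hypothetical.

```python
def carve(voxels, test_voxel):
    """Sweep the voxels repeatedly until a sweep removes nothing.

    test_voxel(v, remaining) is an assumed callback returning
    "retain" or "remove". Returns the surviving voxels and the
    number of sweeps performed.
    """
    remaining = set(voxels)
    sweeps = 0
    while True:
        sweeps += 1
        removed_any = False
        # Steps S21 and S22: test each remaining voxel in turn.
        for v in sorted(remaining):
            if test_voxel(v, remaining) == "remove":
                remaining.discard(v)
                removed_any = True
        # Step S23: stop once a complete sweep removed no voxel.
        if not removed_any:
            return remaining, sweeps
```

In a one-dimensional toy arrangement in which each voxel is hidden by its right-hand neighbour, each sweep exposes one further voxel, which shows why a single sweep is not sufficient.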

This technique means that each surface voxel is
checked against each image in each voxel sweep. The
images in which a voxel is visible will, however, be at
least partly determined by the geometric arrangement of
the camera positions at which the images were recorded.
It thus should be possible to determine from these camera
positions that certain surface voxels will not be visible
or will not be visible in sufficient images to enable
their photoconsistency to be checked. Where this can be
determined, then the voxel colouring process may be
repeated for another set of camera positions, if
available, to enable the photoconsistency of those
surface voxels to be checked. Thus, at step S25, the CPU
22 will determine whether there is another set of camera
positions that should be considered. When the answer at
step S25 is yes, then the CPU 22 will repeat at step S26
steps S21 to S25 for the next set of camera positions
until all sets of camera positions have been considered.

Figure 14a shows in greater detail the test
procedure for a voxel carried out at step S21 in Figure
12.

At step S210 in Figure 14a, the CPU 22 tests the
voxel against each of the images in turn to determine
whether the voxel should be retained or sub-divided. The
CPU 22 then checks at step S211 whether the result of the
test at step S210 was that the voxel should be retained.
If the answer is no, then at step S212 the CPU subjects
the voxel to sub-division and further processing as will
be described in detail below.

If the answer at step S211 is yes, then the CPU 22
tests, at step S213, the consistency between projections
of the same voxel into the different images and then
checks at step S214 whether the result of the test was
that the images were consistent. When the answer at
step S214 is yes, then the CPU 22 retains the voxel at
step S215.

If the answer at step S214 is no, then the CPU 22
checks at step S216 whether the result of the test at
step S213 was that the voxel should be removed and if so
removes the voxel at step S217. If the answer at step
S216 is no then the CPU 22 proceeds to step S212
described above so that the voxel is subjected to
sub-division and further processing.

Figure 14b shows step S210 in greater detail. At
step S40, the CPU 22 tests to see whether a surface voxel
(1) projects into an image; (2) is occluded in respect of
that image; or (3) is partially occluded with respect to
that image and should be sub-divided.

The CPU 22 then checks at step S41 whether the
answer at step S40 was that the voxel was occluded with
respect to that image and, if so, the CPU 22 ignores
that image for that voxel at step S42 and determines
that, on the basis of that image, the voxel should be
retained at step S50. If, however, the answer at step
S41 is no, then the CPU 22 checks to see whether the
answer at step S40 was that the voxel was partially
occluded with respect to that image (step S43). If the
answer at step S43 is yes, then the CPU 22 checks at step
S44 whether the current voxel size is the minimum
allowable and if the answer is yes decides at step S45
that that image should be ignored for that voxel and
that, on the basis of the image, the voxel should be
retained. If the answer at step S44 is no, then the CPU
22 determines at step S46 that the voxel should be
sub-divided.

If the answer at step S43 is no, then in step S47
the CPU projects each of the eight corners of the voxel
under test into the image to identify the pixel patch
corresponding to that voxel. Figure 13 shows
schematically an array of pixels P_0,0 to P_n,n of part
of an image IM to illustrate the projection of a voxel to
a pixel patch Q (shown as a hatched area). The CPU 22
then determines at step S48 the colour of that pixel
patch (for example Q in Figure 13). Where, as shown in
Figure 13, the boundary of the pixel patch cuts through
pixels (such as pixel P 6 in Figure 13) the entirety of
these pixels is considered to fall within the pixel
patch. The CPU 22 determines the colour of the pixel
patch by summing the respective numbers (each between
zero and 255 for each colour in this example) associated
in its memory with the different pixels forming the patch
and dividing that sum by the number of pixels in the
pixel patch to determine the colour (where all the pixels
are the same colour) or the average colour of the pixel
patch. This colour is then stored in the memory 22a by
the CPU 22 for that voxel and that image m.

The CPU 22 then checks at step S49 whether the
variance of the colours of the pixels in the patch
exceeds a predetermined threshold, for example whether
the standard deviation in colour is greater than 10. If
the answer is yes, then the CPU 22 determines that that
image contains too much colour variation and that that
image cannot be used for checking the photoconsistency of
that voxel without sub-division of the voxel. The CPU 22
then determines at step S44 whether the voxel size is
already at a minimum. If the answer is yes, the CPU 22
determines at step S45 that that image should be ignored
for the voxel and that the voxel should, as far as that
image is concerned, be retained at step S50. If the
answer is no, then the CPU determines at step S46 that
the voxel should be sub-divided.
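
A minimal sketch of steps S48 and S49 follows. The description does not say how a single standard deviation is obtained from three colour components, so this sketch assumes that the population standard deviation is computed per colour component and that the largest of the three is compared with the threshold. The function names are hypothetical.

```python
from statistics import pstdev


def patch_colour(pixels):
    """Average colour of a pixel patch (step S48).

    pixels: list of (r, g, b) tuples, each component between 0 and 255.
    """
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def too_varied(pixels, threshold=10.0):
    """Colour variance check (step S49), using the assumed measure:
    the largest per-component population standard deviation."""
    spread = max(pstdev([p[c] for p in pixels]) for c in range(3))
    return spread > threshold
```

When too_varied returns True, the image cannot be used for that voxel without sub-division, so the minimum-size check of step S44 would follow.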




At step S51 in Figure 14b the CPU 22 repeats steps
S40 to S50 for each of the available images and, at step
S52 checks to see whether a decision was taken at step
S46 to sub-divide the voxel with respect to any one or
more of the images. If the answer at step S52 is yes,
then the CPU 22 confirms at step S53 that the voxel is to
be sub-divided. If, however, the answer at step S52 is
no, then the CPU 22 determines at step S54 that the voxel
should be retained.

Figure 14c shows in greater detail the steps carried 
out at step S40 in Figure 14b. Thus, at step S401, the 
CPU 22 defines a straight line passing through the centre 
of the voxel and the focal point F of the camera 
position which produced the image for which the voxel is 
being tested. Figure 10a shows a voxel V_x being
projected into an image IM along the line xx.

The CPU 22 then checks at step S402 whether any 
other voxels lie on the line between the voxel under test 
and the focal point F. If the answer is no, then the CPU 
22 determines that the voxel is not occluded for that 
image at step S403. If, however, the answer at step S402
is yes, then the CPU 22 checks the information in its
memory 22a to determine, at step S404, whether the voxel
lying on the line between the voxel being tested and the
focal point F is a voxel that has been sub-divided, that
is, as will be described below, whether the information
in the CPU's memory 22a includes information marking the
voxel on the line as being partially full. If the answer
at step S404 is yes, then the CPU 22 determines at step
S406 that the voxel under test is partially occluded for
that image. If the answer at step S404 is no, then the
CPU 22 determines that the voxel under test is completely
occluded for that image at step S405. The information as
to whether the voxel under test is occluded, partially 
occluded or not occluded in that image is stored in the 
memory 22a. 
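
The occlusion test of Figure 14c might be sketched as follows, assuming unit-cube voxels indexed by integer cell coordinates and a dictionary recording whether each remaining voxel is full or has been marked partially full. The line from the voxel centre to the focal point is sampled at regular intervals, which is an approximation chosen for brevity rather than an exact line traversal. Where several voxels lie on the line, the sketch assumes that a full voxel takes precedence over a partially full one. All names are hypothetical.

```python
import math


def occlusion(voxel, focal, state, step=0.25):
    """Classify a voxel as "not occluded", "partially occluded" or
    "occluded" with respect to one camera position.

    voxel: integer (i, j, k) index of a unit cube.
    state: dict mapping cell index to "full" or "partial".
    """
    centre = tuple(c + 0.5 for c in voxel)
    direction = tuple(f - c for f, c in zip(focal, centre))
    length = math.sqrt(sum(d * d for d in direction))
    result = "not occluded"                      # step S403 by default
    for s in range(1, int(length / step) + 1):
        t = s * step / length
        point = tuple(c + t * d for c, d in zip(centre, direction))
        cell = tuple(math.floor(x) for x in point)
        if cell == tuple(voxel) or cell not in state:
            continue                             # nothing in the way here
        if state[cell] == "full":
            return "occluded"                    # step S405
        result = "partially occluded"            # step S406
    return result
```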

Figure 14d shows in greater detail step S213 of
Figure 14a. Thus, at step S510, the CPU 22 checks to see
whether the voxel under test projects into two or more
images. If the answer is no, the CPU 22 determines that
the consistency of the voxel cannot be checked and
assumes that the voxel is consistent at step S520. If,
however, the answer is yes, then at step S530 the CPU 22
compares the colour values of the pixel patches Q for
each of the images in which the voxel was visible and
determines whether the colour difference between the
patches is greater than or equal to a first predetermined
threshold ΔC_TH1 by determining whether the standard
deviation of the colour values exceeds a first
predetermined value. Typically, the predetermined value
for the standard deviation may be 20. Any technique may
be used to determine the standard deviation. If the
colour difference between the patches exceeds ΔC_TH1, then
the CPU 22 determines that the voxel is inconsistent and
removes it at step S540. If, however,
the answer at step S530 is no, then the CPU 22 checks at
step S550 whether the colour difference is less than or
equal to a second predetermined threshold ΔC_TH2 smaller
than the first predetermined threshold. In this example
the second predetermined threshold is a standard
deviation of 10. If the answer at step S550 is yes, that
is the standard deviation is equal to or smaller than the
second predetermined threshold, then the CPU 22
determines at step S520 that the voxel is consistent and
should be retained. If the answer at step S550 is no,
then the CPU 22 checks at step S560 whether the voxel
size is already at a minimum and, if so, decides that the
voxel should be removed at step S540. Otherwise the CPU
22 determines that the voxel should be sub-divided (step
S570). Thus, if the pixel patches into which the voxel
projects have a colour variation greater than or equal to
the first threshold the CPU 22 determines that that voxel
cannot possibly form part of the 3D object surface
because its colour is too inconsistent between images. If
however the colour variation between the pixel patches is
less than the first predetermined threshold but greater
than the second predetermined threshold ΔC_TH2 then the
CPU 22 determines that the photoconsistency check is not
conclusive and that the voxel should be sub-divided as
part of the voxel may form part of the surface.
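
The two-threshold check of Figure 14d can be illustrated by the following minimal sketch. As the description allows any technique for determining the standard deviation, the sketch assumes the largest per-component population standard deviation of the patch colours; the function name and the returned strings are likewise assumptions.

```python
from statistics import pstdev


def consistency(patch_colours, at_min_size, th1=20.0, th2=10.0):
    """Return "retain", "remove" or "subdivide".

    patch_colours: one average (r, g, b) colour per image in which
    the voxel is visible. th1 and th2 play the role of the first and
    second predetermined thresholds.
    """
    if len(patch_colours) < 2:                    # step S510
        return "retain"                           # step S520
    spread = max(pstdev([c[k] for c in patch_colours]) for k in range(3))
    if spread >= th1:                             # step S530
        return "remove"                           # step S540
    if spread <= th2:                             # step S550
        return "retain"                           # step S520
    # Inconclusive band between the thresholds: steps S560 and S570.
    return "remove" if at_min_size else "subdivide"
```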

Figure 15 shows a flow chart illustrating in greater
detail the processing carried out at step S212 in Figure
14a. Thus, at step S260 in Figure 15 the CPU 22 adds to
its memory 22a information marking the original voxel as
partially full and retains that voxel to enable the
testing described above with reference to Figure 12 to be
carried out for subsequent voxels. At step S261, the CPU
22 sub-divides the voxel into a set of subsidiary voxels
or sub-voxels. Figure 11 shows a voxel V_x that has been
divided into eight subsidiary voxels of which sub-voxels
V1 to V6 are visible in Figure 11. It will, however, be
appreciated that the CPU 22 may, for example, divide the
voxel into 16 or more sub-voxels.
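
For the eight-way division shown in Figure 11, step S261 might be sketched as below, assuming that a cubic voxel is represented by its minimum corner and edge length. The function name and the representation are assumptions for the illustration.

```python
from itertools import product


def subdivide(origin, size):
    """Split a cubic voxel into eight cubic sub-voxels of half the edge.

    Returns a list of (origin, size) pairs, one per sub-voxel.
    """
    half = size / 2
    return [((origin[0] + dx * half,
              origin[1] + dy * half,
              origin[2] + dz * half), half)
            for dx, dy, dz in product((0, 1), repeat=3)]
```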

Once the CPU 22 has stored the sub-voxels and their 
location in its memory 22a the CPU performs the test 
procedure described above with reference to step S21 in
Figure 12 for a first one of the sub-voxels to determine
whether it should be removed, retained or sub-divided at 
step S262 and then, at step S263, repeats that test 
procedure for each of the other sub-voxels of that voxel. 
It will, of course, be appreciated that the test 
procedure at step S262 is carried out in the manner 
described above with reference to Figures 12 to 14d with 
the exception that, of course, it is a sub-voxel rather 
than a voxel that is being tested. 

Figure 16 shows diagrammatically a portion of the
part of the image shown in Figure 13 to
illustrate the projection of a sub-voxel into a pixel 
patch QS in an image. 

As will be appreciated from Figures 12 to 14d if a 
sub-voxel is found to be partially occluded (that is a 
correspondingly sized sub-voxel which has already been 
divided into further subsidiary voxels is on the line 
between that sub-voxel and the focal point for the image 
concerned) or the colour variance of the patch into
which the sub-voxel projects in an image exceeds the
predetermined threshold or the colours of the patches
into which the sub-voxel projects are inconsistent, then
that sub-voxel may itself be sub-divided. However,
before a sub-division is carried out, the CPU 22 checks
at step S44 in Figure 14b or step S560 in Figure 14d
whether the minimum voxel size has been reached and if so
determines that the minimum size sub-voxel should be
removed rather than sub-divided. The minimum size may be
determined in dependence on the resolution of the images
being considered and may, for example, be the size of a
sub-voxel that projects to a single pixel in an image.

Thus, in this method, when the CPU 22 determines 
that a voxel (for example voxel V_x in Figure 11) is
partially occluded, projects to a pixel patch having too
large a colour variance or the colour difference between
the pixel patches is too great, the CPU 22 does not
immediately remove that voxel but rather subdivides that
voxel into subsidiary voxels (eight in the example given
above) and then tests each of those sub-voxels in turn in
the same way as the voxels were tested. Any consistent
sub-voxels are retained whereas, if a sub-voxel is
determined to be photo-inconsistent, the CPU 22 checks
whether the minimum sub-voxel size has been reached and,
if so, removes the sub-voxel. If not, the CPU 22 further
sub-divides the sub-voxel and repeats the
photoconsistency check for each further sub-divided
voxel.

In the example described above with reference to
Figures 12 to 15, the CPU 22 performs step S21 in
Figure 12 by first checking whether a voxel is occluded,
partially occluded or unoccluded (step S40 in Figure 14b)
and, if the voxel is unoccluded, goes on to check the
colour variance (step S49 in Figure 14b). These two
tests could, however, be combined so that, for example, 
the CPU 22 checks to see if the voxel is fully occluded 
and, if not, then checks the colour variance (step S49 in 
Figure 14b) and, if the colour variance does not exceed 
the predetermined threshold, only then checks to see if 
the voxel is partially occluded. 

Also, the photoconsistency check described with 
reference to Figure 14d may be combined with these other 
checks so that, for example, the CPU 22 checks first to 
see if the voxel is visible in at least two of the images 
then carries out the photoconsistency check and then 
carries out the colour variance test (step S49 in Figure 
14b) and the partial-occlusion test only if the
photoconsistency test is satisfactory. As another
possibility, the partial-occlusion test may be carried
out before the photoconsistency test. Also, step S530 of
Figure 14d could be omitted so that the CPU 22 only tests
to see if the colour difference is less than or equal to
the second predetermined threshold and, if the answer is
no, sub-divides the voxel if it has not already reached
the minimum size. This would mean that there was no
upper threshold beyond which the voxel was considered
definitely to be inconsistent with the 3D object surface.
Although this may further reduce the possibility of a
voxel being erroneously removed it would, as will be
appreciated, increase the number of voxels that have to
be sub-divided and therefore the overall processing time
required.

It will, of course, be appreciated that the first 
and second predetermined thresholds may be user 
adjustable so as to enable a user to adjust these 
thresholds in accordance with the 3D object whose surface 
is being generated. The colour variance threshold may 
similarly be adjusted. 

The method described with reference to Figures 12 to 
15 thus enables the process of determining the 
photoconsistency of a voxel to be further refined by, 
when it is not clear whether a voxel forms part of the 3D 
object surface, sub-dividing that voxel into subsidiary 
voxels (sub-voxels) and then testing the sub-voxels for 
consistency with the 3D object surface. This should 
avoid or at least reduce the possibility of erroneous 
removal of a voxel when, for example, the colour patch
into which that voxel projects in an image contains
significantly different colours or a voxel is partially
occluded from an image. The fact that a voxel can be
sub-divided and the sub-voxels tested before making any
decision to remove that voxel means that it is not
necessary for the initial size of the voxels to be
determined by the smallest colour area in the 3D object
surface to be generated. Rather, the initial voxel size
can be, for example, determined by the overall colouring
of the images being used and need only be made smaller
(sub-divided) where required, that is where the images
have rapidly changing areas of colour such as may, for
example, occur at edges or highly patterned areas of the
surface. This means that the voxel colouring process
avoids or reduces the possibility of erroneous removal of 
a voxel due to significant colour changes within the 
colour patches into which that voxel projects without 
having to define the initial size of the voxels as being 
equivalent to the minimum single colour area in the 
images. This therefore should reduce the computational 
power and time required to generate the 3D object 
surface. 

In the above described embodiment a sub-voxel has
the same shape as the voxels and the photo-inconsistency
threshold is the same for the voxels as it is for
sub-voxels. This need not, however, necessarily be the
case and there may be advantages to having sub-voxels of
different shape from the voxels and to using different
photo-inconsistency thresholds for voxels and sub-voxels.

Figure 17 shows a flowchart illustrating another 
example of a subdivision and further processing procedure 
that may be carried out at step S212 in Figure 14a. 

When the additional processing shown in Figure 17 is 
carried out, steps S260 and S261 are carried out as for 
the additional processing shown in Figure 15. 

When the voxel has been divided into sub-voxels at
step S261, a first sub-voxel i is projected into a pixel
patch in a first image m (for example the pixel patch QS
in Figure 16) at step S264 in Figure 17 by projecting
each corner of the sub-voxel into the image along the
line passing through that corner and the focal point of
the image. At step S265, the CPU 22 determines and
stores the colour of the pixel patch for that sub-voxel
and that image and then, at step S266, checks whether
m = M (that is whether that sub-voxel has been projected
into each of the available images). If the answer is no,
then the CPU 22 increments m by 1 at step S267 and
repeats steps S264 to S266 until the answer at step S266
is yes. When the answer at step S266 is yes, that is a
sub-voxel has been projected into all of the images, the
CPU 22 determines at step S271 whether each of the
sub-voxels into which the voxel has been divided has been
projected into the images (that is whether i = I). If
the answer at step S271 is no, then the CPU 22 increments
i by 1 at step S272 and then repeats steps S261 to S267,
S271 and S272 until the answer at step S271 is yes. When
the answer at step S271 is yes, the CPU 22 will have
determined and stored for each sub-voxel the colour of
the pixel patches associated with that sub-voxel. It
will, of course, be appreciated that the order in which
steps S261 to S267, S271 and S272 are carried out may be
altered so that each sub-voxel is projected into an image
and then the step of projecting the sub-voxels is
repeated image by image.

When the answer at step S271 is yes, the CPU 22
compares at step S273 the determined colours of the pixel
patches for the voxel being considered. Then, at step
S274, the CPU 22 determines whether there is, for that
voxel, a set of pixel patches consisting of a pixel patch
for each image for which the colour difference is
≤ ΔC_TH. Thus, the CPU 22 does not check whether there is
photoconsistency between corresponding sub-voxels but
rather whether there is photoconsistency between pixel
patches from the different images regardless of which
sub-voxel projects into that pixel patch. If the answer
at step S274 is no, that is there is no such set of pixel
patches, then the CPU 22 removes the entire voxel at step
S275. If, however, the answer at step S274 is yes, then
the entire voxel is retained at step S276.
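
The test of step S274 can be pictured as a search for one pixel patch per image such that the chosen patches agree in colour. The sketch below tries every combination, which is exponential in the number of images and is intended only to make the criterion concrete. The colour difference measure (largest per-component population standard deviation) and the default threshold value are assumptions, as is the function name.

```python
from itertools import product
from statistics import pstdev


def voxel_survives(patches_by_image, threshold=10.0):
    """Return True when a photoconsistent set of patches exists.

    patches_by_image[m] lists the average (r, g, b) colours of the
    pixel patches into which the sub-voxels project in image m.
    """
    # Choose one patch from each image, regardless of sub-voxel.
    for combo in product(*patches_by_image):
        spread = max(pstdev([c[k] for c in combo]) for k in range(3))
        if spread <= threshold:
            return True       # step S276: retain the entire voxel
    return False              # step S275: remove the entire voxel
```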

Figure 18 illustrates another way of carrying out 
the voxel colouring process that replaces step S21 
described above with reference to Figures 12 to 15. 

At step S60 in Figure 18, the CPU 22 allocates each 
pixel of each image to be used for the voxel colouring 
process to a quantum of a quantized colour space and 
stores a quantized colour map for each image. Any 
appropriate conventional colour space may be used. In
this example, as shown schematically in Figure 19, the
colour space is a cubic RGB colour space in which the
origin (0,0,0) represents black (K) while the corners of
the cube along the x, y and z axes represent red (R),
green (G) and blue (B), respectively. In this example,
the colour space shown in Figure 19 is quantized by
dividing the colour cube into a set of smaller cubes.
Figure 20 shows one plane of the colour cube to
illustrate this division. As shown in Figure 20, each
side of the colour cube is divided by eight so that the
colour space is divided into 512 quanta. Figure 20 shows
the quanta QU as abutting one another and not
overlapping. The quantized colour map is stored for each
image so that, instead of being represented by the 
original RGB value, each pixel is represented by a number 
identifying the corresponding quantum. 
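
A minimal sketch of the quantization of step S60 is given below for 8-bit RGB values, with each side of the colour cube divided by eight to give 512 quanta. The way in which a quantum is numbered (red index most significant, blue index least significant) is an assumption; the description requires only that each pixel is represented by a number identifying its quantum.

```python
def quantum(rgb, divisions=8):
    """Map an (r, g, b) value with components 0 to 255 to the number
    of the colour quantum containing it (0 to divisions**3 - 1)."""
    width = 256 // divisions            # 32 colour levels per quantum side
    r, g, b = (v // width for v in rgb)
    return (r * divisions + g) * divisions + b
```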

At step S61, the CPU projects voxel n into a pixel 
patch in image m and stores a quantized colour map for 
the patch. This is carried out in the manner shown in 
Figure 14c except that the CPU 22 tests only to see
whether the voxel is fully occluded or unoccluded, that
is, steps S404 and S406 of Figure 14c are omitted. This
quantized colour map will indicate the frequency of
occurrence of each colour quantum in that pixel patch.

Of course, the quantized colour map may be compressed for 
a particular pixel patch so that only the portion of the 
colour space containing quanta present in that pixel 
patch is stored. Thus, for example, where the colours of 
the pixel patch all fall within the plane shown in Figure 
20, then only that portion of the colour space will be 
stored as the quantized colour map. The quantized colour 
map may be stored in tabular form as shown in Figure 20 
with each quantum indicating whether, and if so how many
times, a colour quantum appears in a pixel patch. For
example, Figure 20 shows some of the colour quanta 
associated with numbers indicating the frequency of 
occurrence of those quanta in a pixel patch. As another 
possibility, the quantized colour map may be stored as a 
histogram. 

It will be appreciated that the assigning of the 
pixels to respective colour quanta could be carried out 
after a voxel has been projected into an image so only 
pixels to which a voxel projects are assigned to colour 
quanta.

The CPU 22 then checks if all of the images have
been checked (m = M) at step S62 and, if not, increments
m by one at step S63 and repeats steps S61 to S63 until
the answer at step S62 is yes. The CPU 22 then
determines if the voxel projects into two or more images
(step S64). If the answer is no, the CPU determines that
the photoconsistency cannot be checked and retains the
voxel at step S65. When the answer at step S64 is yes,
the CPU 22 compares, at step S66, the quantized colour
maps for the pixel patches for the images into which the
voxel projects. The CPU 22 then determines at step S67
whether the quantized colour maps share at least one
quantized colour. If the answer is no, then the CPU
determines that the voxel is photo-inconsistent and
removes it at step S68. If, however, the answer is yes,
then the CPU retains that voxel at step S65. Steps S22
to S26 are then carried out as described above with
reference to Figure 12 at step S69.
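
The decision made at steps S64 to S68 might be sketched as follows. The sketch assumes that each pixel patch is held as a histogram of quantum numbers and reads "share at least one quantized colour" as meaning that some quantum occurs in every patch; both points, and the function names, are assumptions made for illustration.

```python
from collections import Counter


def patch_histogram(quantum_numbers):
    """Frequency of occurrence of each colour quantum in one pixel patch."""
    return Counter(quantum_numbers)


def retain_voxel(patch_histograms):
    """Return True if the voxel is retained, False if it is removed.

    `patch_histograms` holds one histogram per image into which the
    voxel projects.
    """
    if len(patch_histograms) < 2:
        # Fewer than two images: photoconsistency cannot be checked,
        # so the voxel is retained (step S65).
        return True
    shared = set(patch_histograms[0])
    for histogram in patch_histograms[1:]:
        shared &= set(histogram)
    # Removed (step S68) only when no quantum is common to the patches.
    return bool(shared)
```

Because a single shared quantum is enough to retain the voxel, the test errs on the side of keeping voxels that project to occluding boundaries.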

The methods described above with reference to 
Figures 15, 17 and 18 enable the voxel colouring process 
to take account of voxels that project to occluding 
boundaries or to areas of high spatial frequency so that 
the voxel does not project to an area of constant colour. 
The method described with reference to Figure 15 enables 
such voxels to be sub-divided and the individual sub- 
voxels to be checked while the methods described above 
with reference to Figures 17 and 18 err on the side of 
caution so that if there is at least some correspondence 
in colour between parts of the different pixel patches 
associated with a voxel, that voxel is retained. This 
should avoid or at least reduce the possibility of 
catastrophic failure of the voxel colouring process 
resulting from erroneous removal of a voxel that actually 
forms part of the 3D object surface but projects to an 
occluding boundary or area of high spatial frequency. 

Another method for defining the 3D object surface 
once the initial voxel space has been defined will now be
described with reference to Figures 21 to 22b.

At step S300 in Figure 21, the CPU 22 selects a 
first set of images for use in the voxel colouring 
process. This first set of images will consist of a sub- 
set of the images used to determine the initial voxel 
space.

Typically, the first set of images will consist of 
up to 20 to 30 images taken at different positions and
orientations around the object. 

At step S301, the CPU 22 performs a voxel colouring 
process using the first set of images as described above 
with reference to Figures 12a and 12b or Figures 12a and 
12b as modified by Figure 15 or 17, or Figure 18. 

At the end of this voxel colouring process, the 
CPU 22 stores at step S302 the current voxel space 
together with the determined colour for each 
photoconsistent non-occluded voxel of the current voxel
space. At step S303 the CPU 22 selects another image
from the stored images, that is an image not in the first
set of images, and at step S301a the CPU 22 performs the
voxel colouring process using the current voxel space and
the new image as will be described in greater detail
below with reference to Figures 22a and 22b. At step
S304, the CPU determines whether the voxel colouring
process converged to a reasonable 3D object surface.
This determination may be effected by the CPU 22 causing 
the 3D object surface to be displayed to the user on the 
display 24 together with a message saying "Please confirm 
acceptance of the 3D object surface" so that the user can 
determine whether the voxel colouring process has 
proceeded satisfactorily or whether erroneous removal of 
voxels has resulted in an erroneous 3D object surface. 
Alternatively, the CPU 22 itself may determine roughly 
whether the 3D object surface is acceptable by using the 
data regarding the volume of the object that may 
previously have been input by the user. In this case, 
the CPU 22 would determine that the 3D object surface is
not acceptable if the volume bounded by that 3D object 
surface is less than the expected volume of the object. 

When the answer at step S304 is no, then at step 
S305 the CPU 22 increases the allowable colour difference 
used in the voxel colouring process and repeats steps 
S301a, S304 and S305 until the CPU determines at step
S304 that the 3D object surface is acceptable. This 
repetition of the voxel colouring process is possible 
because the voxel space that resulted from the previous 
voxel colouring process is stored at step S302 and the 
image data for the new image added for the current voxel
colouring process is stored at step S303 and is not
discarded until the answer at step S304 is yes. This
method thus enables a user to return to the previously
determined voxel space if the voxel colouring process
carried out at step S301a results in erroneous removal of
one or more voxels or even catastrophic failure of the
voxel colouring process.

When the answer at step S304 is yes, then the CPU 22
stores the newly derived voxel space as the current voxel
space together with the determined colour for each
photoconsistent non-occluded voxel and discards the
previously stored image at step S306 and then checks at
step S307 whether there is another image available.

Step S307 may be carried out automatically by the
CPU 22 where a large number of images have been
pre-stored. The images may be selected by the CPU in any
predetermined order. For example, the images may be 
successive images along a predetermined path around the 
object. As another possibility, the first set of images 
may consist of images taken at predetermined intervals or 
angles relative to one another around the object and the 
next images may be intermediate those images and so on. 

As another possibility at step S307, the CPU 22 may 
allow the user a choice in the next image selected. For
example, the CPU 22 may display a message to the user
requesting the user to select one of a number of
additional pre-stored images and may also give the user
the opportunity to input data for further images (for
example via a removable disc 27, as a signal over the
interface I or using a digital camera). In this way, the
user can view the results of the previous voxel colouring
process and determine whether it would improve the 3D
object surface if data from one or more additional images
was also used in the voxel colouring process.

Steps S303 to S307 are repeated until the answer at 
step S307 is no, that is no more images are available. 
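
The loop of steps S303 to S307 might be sketched as follows. The sketch is illustrative only: `colour_pass` and `is_acceptable` are hypothetical callables standing for the voxel colouring process of step S301a and the acceptance test of step S304, and the `max_retries` limit is an added safeguard that is not part of the method as described.

```python
import copy


def refine(current, new_images, colour_pass, is_acceptable,
           threshold, step, max_retries=10):
    """Refine a stored voxel space with one new image at a time.

    colour_pass(space, image, threshold) returns a new voxel space;
    is_acceptable(space) returns True when the surface is accepted.
    """
    for image in new_images:                       # steps S303 and S307
        for _ in range(max_retries):
            # Work on a copy so that the stored voxel space survives a
            # failed pass and can be returned to.
            candidate = colour_pass(copy.deepcopy(current), image, threshold)
            if is_acceptable(candidate):           # step S304
                current = candidate                # step S306
                break
            threshold += step                      # step S305
        else:
            raise RuntimeError("no acceptable surface found")
    return current, threshold
```

Keeping the previous voxel space untouched until a pass is accepted is what allows the process to be repeated with a larger allowable colour difference.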

Figures 22a and 22b illustrate in greater detail the 
step S301a of Figure 21 of performing a voxel colouring 
process using the current voxel space and a new image. 

At step S221, the voxel n is projected into a pixel 
patch in the new image in the manner described above with 
reference to Figure 14. If the voxel n does not project 
into the new image then as described with reference to 
Figure 14, the CPU 22 proceeds to point c which is step
S228 in Figure 22a and if all the non-occluded voxels of
the current voxel volume have not yet been projected into
the new image, increments n by 1 at step S229 and then
repeats step S221. When the voxel does project into the
new image, the CPU 22 determines at step S223 the colour
of the pixel patch and stores this colour in association
with the voxel n for the new image in its memory 22a.
The step S223 of determining the pixel patch colour is
carried out in the same manner as described above with
reference to Figure 12a.

At step S224, the CPU 22 compares the colour of the 
pixel patch for the new image with the stored colour 
associated with that voxel in the current voxel space. 
The CPU then checks at step S225 whether the colour 
difference is less than or equal to the predetermined 
threshold $\Delta C_{TH}$. If the answer is no, the voxel is
removed at step S226 while if the answer is yes the voxel 
is retained at step S227. The CPU then determines at 
step S228 whether all the non-occluded voxels of the 
current voxel space have been visited and if the answer 
is no increments n by 1 at step S229 and then repeats 
steps S221 to S229 until the answer at step S228 is yes. 
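
A single sweep of steps S221 to S229 might be sketched as follows. The sketch assumes that colours are RGB triples and that the colour difference is a Euclidean distance; the metric, the dictionary representation and the function names are assumptions made for illustration.

```python
def colour_difference(c1, c2):
    """Euclidean distance between two RGB triples (an assumed metric)."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2)) ** 0.5


def sweep_with_new_image(stored_colours, patch_colours, threshold):
    """Check each non-occluded voxel against one new image.

    stored_colours: voxel id -> colour stored with the current voxel space.
    patch_colours:  voxel id -> colour of the pixel patch in the new image;
                    a voxel that does not project into the image is absent.
    Returns the retained voxels with their stored colours.
    """
    retained = {}
    for voxel, stored in stored_colours.items():
        patch = patch_colours.get(voxel)
        if patch is None:
            retained[voxel] = stored      # no projection: nothing to compare
        elif colour_difference(stored, patch) <= threshold:
            retained[voxel] = stored      # step S227
        # otherwise the voxel is removed (step S226)
    return retained
```

Only the stored colours and the single new image are needed for the sweep, which is why the earlier images need not be kept in working memory.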

When the answer at step S228 is yes, the CPU 22 
determines at step S230 that the voxel sweep has been 
completed (that is all non-occluded voxels have been
visited). The CPU then checks at step S231 whether any
voxels have been removed in the sweep and if the answer
is yes resets n and m for the remaining voxels at step
S232 and, for the reasons given above, repeats steps
S221a to S232 until the answer at step S231 is no. When
the answer at step S231 is no, the CPU 22 determines
whether there are any other sets of camera positions to
be considered at step S233 and if the answer is yes
repeats at step S234 steps S221a to S234 until all of the
sets of cameras have been considered.

As will be appreciated from the above, the steps set 
out in Figures 22a and 22b are carried out each time a 
new image is added and the photoconsistency of that new 
image is compared with the stored results of the previous 
voxel colouring process. This means that it is only
necessary to store in the CPU's working memory 22a the 
current voxel space, the colour associated with each non- 
occluded voxel of that space and the current image. This 
also means that the 3D object surface resulting from the 
voxel colouring process can be refined as required by the 
user simply by requesting the CPU 22 to check the 
photoconsistency of the existing voxel volume against 
another image at step S307 in Figure 21. 

Figures 23, 24a and 24b illustrate another method 
for defining the 3D object surface once the initial voxel 
space has been defined. Figure 23 corresponds to Figure
21 while Figures 24a and 24b correspond to Figures 22a
and 22b.

The method shown in Figures 23, 24a and 24b differs 
from that described above with reference to Figures 21 to 
22 in that, in this case, a number of previous images are
retained in addition to the new image and the voxel 
colouring process is repeated using the current voxel 
space, the stored previous images and the new images. 
The number of previous images used will be considerably 
less than that used as the first set of images and may 
be, for example, 10. The number of previously stored
images is kept constant so that, each time a new image is 
added, the oldest of the previously stored images is 
discarded. Where images of the first set still remain, 
then the image to be discarded (that is the "oldest" 
image) will be selected at random from that first set. 
Once all of the first set of images have been discarded, 
then the oldest image can be determined by looking at the 
time at which that image was added. 
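
The rule for discarding a previously stored image might be sketched as follows. The class and its method names are hypothetical, and the sketch assumes that the retained images are simply held in two lists: those remaining from the first set and those added later, oldest first.

```python
import random


class RetainedImages:
    """Keeps a constant number of previously used images."""

    def __init__(self, first_set, rng=None):
        self.first_set = list(first_set)  # images with no meaningful age
        self.added = []                   # later images, oldest first
        self.rng = rng or random.Random()

    def add(self, image):
        """Store a new image and return the image discarded to make room."""
        if self.first_set:
            # While images of the first set remain, the "oldest" image is
            # chosen at random from that set.
            index = self.rng.randrange(len(self.first_set))
            discarded = self.first_set.pop(index)
        else:
            # Afterwards the oldest image is the one added earliest.
            discarded = self.added.pop(0)
        self.added.append(image)
        return discarded

    def images(self):
        return self.first_set + self.added
```

For example, starting from a first set of two images and adding three more discards the two first-set images in a random order and then the earliest added image.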

As can be seen from Figure 23, in this method steps 
S300 to S302 are carried out in the same manner as
described above with reference to Figure 21. However, at 
step S303a, instead of just storing the new image in 
place of the previous images, the CPU 22 stores the new 
image together with x (in this example 10) of the
previously used images and discards all other images.

The voxel colouring process is then carried out at 
step S301b using the current voxel space and the new set 
of images (that is the new image and the previous 10 
images). Steps S304 to S307 are then carried out as
described above with reference to Figure 21.

In the method shown in Figure 23, the voxel
colouring process carried out at step S301 is the same as
that described above with reference to Figures 12 and 14
or Figures 12a and 12b when modified by Figure 17 or 18
or Figure 18.

The voxel colouring process carried out at step 
S301b differs somewhat from that described above with
reference to Figures 22a and 22b as can be seen from
Figures 24a and 24b. Thus, at step S221a in Figure 24a,
the CPU 22 projects voxel n into a pixel patch in a first
one of the new set of images in the manner described 
above with reference to Figure 14. 

The CPU 22 then determines and stores the colour of
the pixel patch at step S222a in the manner described
above and at step S223a the CPU 22 determines whether the
voxel n has been projected into each of the new set of
images. If the answer at step S223a is no, then the CPU
22 projects voxel n into the next one of the new set of
images at step S223b in the manner described above with
reference to Figure 14. When the answer at step S223a is
yes, the CPU 22 determines at step S224a whether the
voxel n projects into at least one of the new set of
images. If the answer is no, then the CPU 22 determines
that it is not possible to check the photoconsistency of
that voxel in this particular voxel colouring process and
so retains that voxel at step S227 (Figure 24b). If the
answer at step S224a is yes, then the CPU 22 compares, at
step S224a, the colours of the pixel patches for the ones
of the new set of images into which the voxel n projects
and the colour associated with that voxel in the current
voxel volume. The CPU 22 then determines at step S225
whether the difference in colour between the pixel
patches and the colour associated with that voxel in the
current voxel volume is less than or equal to $\Delta C_{TH}$. If
the answer at step S225 is no, then the voxel is removed
at step S226 while if the answer is yes the voxel is
retained at step S227. Steps S228 to S234 are then
carried out as described above with reference to Figures
22a and 22b.

The method described above with reference to Figures 
23 to 24b requires a larger amount of data to be stored
than the method described with reference to Figures 21 to
22b. However, the storage of the additional ones of the
previous images means that less image information is lost
and allows the photoconsistency of the surface voxels of
the current voxel volume to be checked again with each of 
these images in combination with the new image. In 
contrast, the method described with reference to Figures 
21 to 22 requires less storage of data but only enables
the new image to be checked against the currently decided 
voxel space. 

Figure 25 shows a top plan view corresponding to 
Figure 8 but part way into a voxel colouring process (so 
that some voxels have already been removed) to illustrate 
the effect of adding camera positions. The initial 
camera positions A to D are represented in Figure 25 by
the corresponding focal points F_A to F_D and the imaging
areas IM_A to IM_D while additional camera positions E to H
are represented in Figure 25 by the focal points F_E to F_H
and the imaging areas IM_E to IM_H.

The effect of adding the four additional camera 
positions E to H will now be described for the four 
voxels VA to VD shown coloured black in Figure 25. Thus, 
voxel VA is visible at only one of the original four 
camera positions, that is camera position B, because 
intervening voxels occlude voxel VA as far as the other
three camera positions A, C and D are concerned. For
example, voxel VX amongst others occludes voxel VA from
camera position C. Similarly, voxel VD is visible only
at camera position B of the four original camera
positions while voxel VB is visible at camera positions
C and D and voxel VC is visible at camera position A.
Thus, when only the four camera positions A to D are 
provided, it is not possible to determine the 
photoconsistency of voxels VA and VC because they are 
only visible at a single camera position. In contrast, 
when the additional four camera positions E to H are 
added, voxel VA becomes visible at camera positions B, E
and F while voxel VC becomes visible at camera positions
D, G and H enabling the photoconsistency of these two
voxels to be checked. Voxel VD is visible at two of the
four original camera positions and so its
photoconsistency can be checked without the additional
camera positions. However, when the additional camera
positions are added, voxel VD also becomes visible at 
camera position E so that the voxel VD is visible from 
three camera positions which should enable a more 
accurate determination as to whether the voxel VD forms 
part of the 3D object surface or not. Similarly, voxel
VB which was visible at two of the original camera
positions C and D becomes visible at four camera
positions B, C, F and G when the four additional camera
positions are added which should again enable greater
accuracy in determining whether or not the voxel forms
part of the 3D object surface.

In the arrangement shown in Figure 25, the 
additional camera positions are provided intermediate the 
original four camera positions. A further additional 
camera position may, for example, be provided looking 
directly down onto the top of the object. The manner in
which additional camera positions are added may be
determined by the CPU 22 in accordance with a pre-stored
algorithm. For example, as shown in Figure 25, each set
of additional camera positions may add a camera position
intermediate each pair of adjacent camera positions.
Alternatively or additionally, the addition of camera
positions may be under the control of the user so that,
for example, at step S307 in Figures 21 and 23, the user
determines the selection of the additional image (and 
thus the camera position) on the basis of the current 
estimate of the 3D object surface. This enables the user 
to add additional camera positions at the points where he 
can see from visual inspection of the estimated 3D object 
surface that further information is required so as to
better define the 3D object surface.
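
One possible pre-stored algorithm for adding a camera position intermediate each pair of adjacent camera positions might be sketched as follows. The sketch assumes, purely for illustration, that the camera positions lie on a circle around the object and are described by their angles in degrees; the function name is hypothetical.

```python
def intermediate_angles(angles_deg):
    """Return one additional angle midway between each adjacent pair.

    `angles_deg` holds the angles (in degrees) of the existing camera
    positions around the object.
    """
    ordered = sorted(a % 360.0 for a in angles_deg)
    extra = []
    for i, a in enumerate(ordered):
        b = ordered[(i + 1) % len(ordered)]   # next position, wrapping round
        gap = (b - a) % 360.0 or 360.0        # a lone camera spans the circle
        extra.append((a + gap / 2.0) % 360.0)
    return extra
```

Applied to four positions at 0, 90, 180 and 270 degrees this gives four further positions at 45, 135, 225 and 315 degrees, and the function may be applied again to the combined set for a further set of positions.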

As can be seen, the likelihood of a voxel that is 
not actually on the surface of the 3D object being 
erroneously retained will reduce with increase in the 
number of images used. Thus, the methods described above 
enable further refinement of the generated 3D object 
surface so as to bring it into closer agreement with the 
actual 3D object surface without significantly increasing 
the amount of data that needs to be stored at any one 
time by the main processing unit. 

As described above, a single new image is added for 
each successive voxel colouring process. However, 
instead of adding a single new image, a set of new images 
may be added. Thus, for example, images recorded at all 
or subsets of the additional camera positions shown in 
Figure 25a may be added simultaneously at step S303 in
Figure 21 and step S303a in Figure 23 and the further
voxel colouring processes of steps S301a and S301b
carried out using all simultaneously added new images.

In the embodiment described with reference to 
Figures 23 to 24b, where a set of previous images are
retained for carrying out the further voxel colouring
process, the set of previous images may consist simply of
the last used x images or may consist of images that are
strategically important in the voxel colouring process.
These images may be selected by the user. Thus, for
example, at step S303a in Figure 23, the CPU 22 may
display to the user on display 24 a message requesting
the user to select from the currently stored images the
images to be retained for the next voxel colouring
process.

It will be appreciated that the initial voxel space
defining process described above with reference to
Figures 3 to 11 may be used with the voxel colouring
process described with reference to Figures 12a and 12b,
or Figures 12a and 12b as modified by Figure 17 or Figure
18 or the voxel colouring process as described above with
reference to Figures 18 and 12b or any conventional voxel
colouring process. Similarly, the iterative voxel
colouring processes described above with reference to
Figures 21 to 22a or 23 to 24 may be used in combination
with the modifications described above with reference to
Figures 15, 17 and 18.

The voxel colouring processes described above with
reference to Figures 12a, 12b and 15, Figures 12a, 12b
and 17 or Figures 18 and 12b may be used where the
initial voxel space is defined in the manner described in
the aforementioned University of Rochester Computer
Sciences Technical Report or any other conventional
process for defining the initial voxel space, for example
by setting the initial voxel space as a volume known by
a user to be sufficiently large to encompass the object
whose 3D surface is to be generated. Similarly, the
iterative voxel colouring processes described above with
reference to Figures 21 to 22b or Figures 23 to 24 may be
used with such known initial voxel space defining 
techniques. The initial voxel space or resulting 3D 
object surface data may be downloaded onto a storage 
medium such as a disc or supplied as a signal over, for 
example, a network. 

Once the 3D object surface has been generated and 
stored by the CPU in the mass-storage system 25, then, if 
desired or required, the texture data generation 
module 14 shown in Figure 1 may be used to generate 
texture data from the input image data showing the object 
for rendering the 3D object surface produced as described 
above. The texture data generation module may form part 
of the same image processing apparatus or may be provided 
by a separate image processing apparatus to which the 3D 
object surface data is downloaded from a storage medium 
or supplied as a signal.

It will, of course, be appreciated that the focal length
of a camera may be so long that, in practice, the viewing 
cone of the camera can be represented by a viewing volume 
in which the rays defining the viewing volume are 
parallel or substantially parallel to one another. 

The present application incorporates by cross-reference 
the full contents of the following applications of the 
assignee which are being filed simultaneously herewith: 

- Attorney reference CFP1793US (2636550) which claims
priority from UK applications 9927876.4, 9927875.6,
0019081.9 and 0019122.1.

- Attorney reference CFP1796US (2641950) which claims
priority from UK applications 9927906.9, 9927907.7,
9927909.3, 0019080.1, 0019087.6 and 0019086.8.

- Attorney reference CFP1800US (2635850) which claims
priority from UK applications 0001300.3, 0001479.5,
0018492.9, 0019120.5, 0019082.7 and 0019089.2.






ANNEX A 

1 CORNER DETECTION 

1.1 Summary

The process described below calculates corner points, to sub-pixel accuracy, from a
single grey scale or colour image. It does this by first detecting edge boundaries in the
image and then choosing corner points to be points where a strong edge changes
direction rapidly. The method is based on the facet model of corner detection,
described in Haralick and Shapiro.

1.2 Algorithm 

The algorithm has four stages:

(1) Create grey scale image (if necessary);

(2) Calculate edge strengths and directions;

(3) Calculate edge boundaries;

(4) Calculate corner points.

1.2.1 Create grey scale image

The corner detection method works on grey scale images. For colour images, the 
colour values are first converted to floating point grey scale values using the formula:

$$\text{grey\_scale} = (0.3 \times \text{red}) + (0.59 \times \text{green}) + (0.11 \times \text{blue}) \tag{A-1}$$

This is the standard definition of brightness as defined by NTSC and described in 
Foley and van Dam.
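
As a minimal sketch, the conversion might be written in Python as follows; the function names and the representation of an image as rows of (red, green, blue) tuples are assumptions made for illustration.

```python
def grey_scale(red, green, blue):
    """Weighted sum of the colour components, using the weights of A-1."""
    return 0.3 * red + 0.59 * green + 0.11 * blue


def to_grey_image(colour_image):
    """Convert rows of (red, green, blue) tuples to floating point grey values."""
    return [[grey_scale(*pixel) for pixel in row] for row in colour_image]
```

Since the three weights sum to one, a pixel whose components are all equal keeps that value as its grey level, to within floating point rounding.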

1.2.2 Calculate edge strengths and directions

The edge strengths and directions are calculated using the 7x7 integrated directional 
derivative gradient operator discussed in section 8.9 of Haralick and Shapiro.

The row and column forms of the derivative operator are both applied to each pixel 
in the grey scale image. The results are combined in the standard way to calculate the 
edge strength and edge direction at each pixel. 

The output of this part of the algorithm is a complete derivative image.

1.2.3 Calculate edge boundaries 

The edge boundaries are calculated by using a zero crossing edge detection method
based on a set of 5x5 kernels describing a bivariate cubic fit to the neighbourhood of
each pixel.

The edge boundary detection method places an edge at all pixels which are close to
a negatively sloped zero crossing of the second directional derivative taken in the
direction of the gradient, where the derivatives are defined using the bivariate cubic
fit to the grey level surface. The subpixel location of the zero crossing is also stored
along with the pixel location.

The method of edge boundary detection is described in more detail in section 8.8.4
of Haralick and Shapiro.

1.2.4 Calculate corner points

The corner points are calculated using a method which uses the edge boundaries
calculated in the previous step.

Corners are associated with two conditions:

(1) the occurrence of an edge boundary; and 

(2) significant changes in edge direction. 

Each of the pixels on the edge boundary is tested for "cornerness" by considering two
points equidistant to it along the tangent direction. If the change in the edge direction
is greater than a given threshold then the point is labelled as a corner. This step is
described in section 8.10.1 of Haralick and Shapiro.

Finally the corners are sorted on the product of the edge strength magnitude and the
change of edge direction. The top 200 corners which are separated by at least 5 
pixels are output. 
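
The final sorting and selection might be sketched as follows. The greedy way in which the 5 pixel separation is enforced (strongest corners first) is an assumption, as are the function name and the tuple layout.

```python
def select_corners(candidates, max_corners=200, min_separation=5.0):
    """Pick the strongest, well separated corners.

    candidates: iterable of (x, y, edge_strength, direction_change).
    Corners are ranked on edge_strength * direction_change and accepted
    in that order, skipping any closer than min_separation pixels to a
    corner already accepted.
    """
    ranked = sorted(candidates, key=lambda c: c[2] * c[3], reverse=True)
    chosen = []
    for x, y, strength, change in ranked:
        far_enough = all(
            (x - cx) ** 2 + (y - cy) ** 2 >= min_separation ** 2
            for cx, cy, _, _ in chosen
        )
        if far_enough:
            chosen.append((x, y, strength, change))
            if len(chosen) == max_corners:
                break
    return chosen
```

With this ordering a weak corner lying next to a strong one is dropped in favour of the strong one rather than the other way round.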

2 FEATURE TRACKING

2.1 Summary

This process described below tracks feature points (typically corners) across a 
sequence of grey scale or colour images. 

The tracking method uses a constant image velocity Kalman filter to predict the 
motion of the corners, and a correlation based matcher to make the measurements of 
corner correspondences. 

The method assumes that the motion of corners is smooth enough across the sequence
of input images that a constant velocity Kalman filter is useful, and that corner
measurements and motion can be modelled by Gaussians.

2.2 Algorithm 

1) Input corners from an image.

2) Predict forward using Kalman filter.

3) If the position uncertainty of the predicted corner is greater than a threshold,
$\Delta$, as measured by the state positional variance, drop the corner from the list
of currently tracked corners.

4) Input a new image from the sequence.

5) For each of the currently tracked corners:

a) search a window in the new image for pixels which match the corner;

b) update the corresponding Kalman filter, using any new observations
(i.e. matches).

6) Input the corners from the new image as new points to be tracked (first,
filtering them to remove any which are too close to existing tracked points).

7) Go back to (2).

2.2.1 Prediction

This uses the following standard Kalman filter equations for prediction, assuming a
constant velocity and random uniform gaussian acceleration model for the dynamics:

$$X_{n+1} = \Theta X_n \tag{A-2}$$

$$K_{n+1} = \Theta K_n \Theta^T + Q \tag{A-3}$$

where X is the 4D state of the system (defined by the position and velocity vector of
the corner), K is the state covariance matrix, $\Theta$ is the transition matrix, and Q is the
process covariance matrix.

In this model, the transition matrix and process covariance matrix are constant and
have the following values:

$$\Theta = \begin{pmatrix} I & I \\ 0 & I \end{pmatrix} \tag{A-4}$$

$$Q = \begin{pmatrix} 0 & 0 \\ 0 & \sigma_v^2 I \end{pmatrix} \tag{A-5}$$

2.2.2 Searching and matching

This uses the positional uncertainty (given by the top two diagonal elements of the
state covariance matrix, K) to define a region in which to search for new
measurements (i.e. a range gate).

The range gate is a rectangular region of dimensions:

$$\Delta x = \sqrt{K_{11}}, \quad \Delta y = \sqrt{K_{22}} \tag{A-6}$$

The correlation score between a window around the previously measured corner and
each of the pixels in the range gate is calculated.

The two top correlation scores are kept.

If the top correlation score is larger than a threshold, $C_0$, and the difference between
the two top correlation scores is larger than a threshold, $\Delta C$, then the pixel with the
top correlation score is kept as the latest measurement.
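
The acceptance test on the two top correlation scores might be sketched as follows. The default thresholds are the values quoted for these two parameters in section 2.2.4; the function name, the dictionary of scores and the treatment of a range gate containing a single pixel are assumptions.

```python
def accept_match(scores, c0=0.9, delta_c=0.001):
    """Return the best matching pixel, or None if the match is rejected.

    scores: dict mapping each pixel (x, y) in the range gate to its
    correlation score.
    """
    if not scores:
        return None
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    best_pixel, best = ranked[0]
    # With a single candidate there is no runner-up to be ambiguous with.
    second = ranked[1][1] if len(ranked) > 1 else float("-inf")
    if best > c0 and best - second > delta_c:
        return best_pixel
    return None
```

A rejected match simply means that no measurement is available for that corner in the new image, so only the prediction is carried forward.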

2.2.3 Update



The measurement is used to update the Kalman Filter in the standard way: 

$$G = K H^T (H K H^T + R)^{-1} \tag{A-7}$$

$$X \rightarrow X + G(\hat{X} - HX) \tag{A-8}$$

$$K \rightarrow (I - GH)K \tag{A-9}$$



where G is the Kalman gain, H is the measurement matrix, and R is the measurement 
covariance matrix.

In this implementation, the measurement matrix and measurement covariance matrix 
are both constant, being given by: 

$$H = \begin{pmatrix} I & 0 \end{pmatrix} \tag{A-10}$$

$$R = \sigma^2 I \tag{A-11}$$
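
A prediction and update cycle of this kind might be sketched in plain Python as follows. The sketch assumes the usual constant velocity forms: a state ordered as (x, y, vx, vy), a transition matrix that adds the velocity to the position, process noise on the velocity components only, a measurement matrix that picks out the position, and an isotropic measurement covariance. These forms and all helper names are assumptions made for illustration.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def add(a, b, sign=1.0):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Assumed constant velocity model for a state (x, y, vx, vy).
THETA = [[1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0], [0, 0, 0, 1]]
H = [[1, 0, 0, 0], [0, 1, 0, 0]]


def predict(x, k, sigma_v2):
    """X -> Theta X and K -> Theta K Theta^T + Q."""
    q = [[0.0] * 4 for _ in range(4)]
    q[2][2] = q[3][3] = sigma_v2          # noise on the velocity terms only
    x_pred = matmul(THETA, x)
    k_pred = add(matmul(matmul(THETA, k), transpose(THETA)), q)
    return x_pred, k_pred


def update(x, k, measurement, sigma2):
    """Standard Kalman update with R = sigma2 * I."""
    r = [[sigma2, 0.0], [0.0, sigma2]]
    ht = transpose(H)
    s = add(matmul(matmul(H, k), ht), r)              # H K H^T + R
    g = matmul(matmul(k, ht), inverse_2x2(s))         # Kalman gain
    innovation = add(measurement, matmul(H, x), sign=-1.0)
    x_new = add(x, matmul(g, innovation))
    k_new = add(k, matmul(matmul(g, H), k), sign=-1.0)  # (I - G H) K
    return x_new, k_new
```

The state and the measurement are column vectors, for example `[[10.0], [20.0], [0.0], [0.0]]` for a corner at (10, 20) with zero velocity. The top two diagonal elements of the predicted covariance are the positional variances used to size the range gate.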



2.2.4 Parameters

The parameters of the algorithm are:

Initial conditions: $X_0$ and $K_0$.
Process velocity variance: $\sigma_v^2$.
Measurement variance: $\sigma^2$.
Position uncertainty threshold for loss of track: $\Delta$.
Covariance threshold: $C_0$.
Matching ambiguity threshold: $\Delta C$.

For the initial conditions, the position of the first corner measurement and zero
velocity are used, with an initial covariance matrix of the form:

$$K_0 = \begin{pmatrix} 0 & 0 \\ 0 & \sigma_0^2 I \end{pmatrix} \tag{A-12}$$

$\sigma_0^2$ is set to $\sigma_0^2 = 200$ (pixels/frame)$^2$.

The algorithm's behaviour over a long sequence is anyway not too dependent on the 
initial conditions. 



The process velocity variance is set to the fixed value of 50 (pixels/frame)$^2$. The
process velocity variance would have to be increased above this for a hand-held
sequence. In fact it is straightforward to obtain a reasonable value for the process
velocity variance adaptively.

The measurement variance is obtained from the following model: 

$$\sigma^2 = (rK + a) \tag{A-13}$$

where $K = \sqrt{K_{11} K_{22}}$ is a measure of the positional uncertainty, "r" is a parameter
related to the likelihood of obtaining an outlier, and "a" is a parameter related to the
measurement uncertainty of inliers. "r" and "a" are set to r=0.1 and a=1.0.

This model takes into account, in a heuristic way, the fact that it is more likely that
an outlier will be obtained if the range gate is large.
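
A minimal sketch of this model is given below. It assumes that the positional uncertainty is the square root of the product of the two position variances on the diagonal of the state covariance matrix (zero-based indices [0][0] and [1][1] in the code). The function name is hypothetical.

```python
import math

def measurement_variance(K, r=0.1, a=1.0):
    """Measurement variance as a function of the current state covariance K.

    K is a nested list whose first two diagonal entries are taken to be the
    x and y position variances (an assumption about the state ordering).
    """
    positional_uncertainty = math.sqrt(K[0][0] * K[1][1])
    # A larger positional uncertainty means a larger range gate, hence a
    # greater chance of an outlier, hence a larger measurement variance.
    return r * positional_uncertainty + a
```

With position variances of 100 pixels squared on both axes the sketch gives a measurement variance of about 11 pixels squared, of which 10 comes from the outlier term.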
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The measurement variance (in fact the full measurement covariance matrix R) could 
also be obtained from the behaviour of the auto-correlation in the neighbourhood of 
the measurement. However this would not take into account the likelihood of 
obtaining an outlier. 

The remaining parameters are set to the values $\Delta = 400$ pixels$^2$, $C_0 = 0.9$ and $\Delta C = 0.001$.
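
By way of illustration, the way the two correlation thresholds are used to accept or reject a match can be sketched as follows. The function name and the handling of a single candidate (accepted if it passes the first threshold) are assumptions made for the sketch.

```python
def accept_match(scores, c0=0.9, delta_c=0.001):
    """Return the index of the accepted candidate, or None if the match fails.

    scores holds one correlation score per candidate pixel in the range gate.
    """
    if not scores:
        return None
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    best = order[0]
    # The top score must exceed the correlation threshold.
    if scores[best] <= c0:
        return None
    # The top two scores must differ by more than the ambiguity threshold.
    if len(order) > 1 and scores[best] - scores[order[1]] <= delta_c:
        return None
    return best
```

A rejected match simply means that no measurement is available for the current frame.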

3. 3D SURFACE GENERATION

3.1 Architecture 

In the method described below, it is assumed that the object can be segmented from 
the background in a set of images completely surrounding the object. Although this 
restricts the generality of the method, this constraint can often be arranged in practice,
particularly for small objects. 

The method consists of five processes, which are run consecutively: 

First, for all the images in which the camera positions and orientations have 
been calculated, the object is segmented from the background, using colour 
information. This produces a set of binary images, where the pixels are 
marked as being either object or background. 

The segmentations are used, together with the camera positions and 
orientations, to generate a voxel carving, consisting of a 3D grid of voxels 
enclosing the object. Each of the voxels is marked as being either object or 
empty space. 

The voxel carving is turned into a 3D surface triangulation, using a standard
triangulation algorithm (marching cubes). 
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The number of triangles is reduced substantially by passing the triangulation 
through a decimation process. 

Finally the triangulation is textured, using appropriate parts of the original 
images to provide the texturing on the triangles. 



3.2 Segmentation 

The aim of this process is to segment an object (in front of a reasonably homogeneous 
coloured background) in an image using colour information. The resulting binary 
image is used in voxel carving. 

Two alternative methods are used: 

Method 1: input a single RGB colour value representing the background 
colour - each RGB pixel in the image is examined and if the Euclidean 
distance to the background colour (in RGB space) is less than a specified 
threshold the pixel is labelled as background (BLACK). 

Method 2: input a "blue" image containing a representative region of the 
background. 
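
As an illustration of Method 1 above, a minimal sketch is given here. The image is assumed to be a list of rows of (r, g, b) tuples, and BLACK and WHITE are assumed to be encoded as 0 and 255; these representations and the function name are not part of the original description.

```python
BLACK, WHITE = 0, 255  # assumed encodings for background and object

def segment_by_colour(image, background, threshold):
    """Label each pixel BLACK if it lies within `threshold` of the
    background colour in RGB space (Euclidean distance), else WHITE."""
    limit = threshold * threshold  # compare squared distances, no sqrt needed
    output = []
    for row in image:
        out_row = []
        for r, g, b in row:
            d2 = ((r - background[0]) ** 2 +
                  (g - background[1]) ** 2 +
                  (b - background[2]) ** 2)
            out_row.append(BLACK if d2 < limit else WHITE)
        output.append(out_row)
    return output
```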



The algorithm for Method 2 has two stages:

(1) Build a hash table of quantised background colours.

(2) Use the table to segment each image.



Step 1) Build hash table



Go through each RGB pixel, "p", in the "blue" background image.
Set "q" to be a quantised version of "p". Explicitly:



...A-14 



where "t" is a threshold determining how near RGB values need to be to background 
colours to be labelled as background. 

The quantisation step has two effects:

1) reducing the number of RGB pixel values, thus increasing the efficiency of
hashing;

2) defining the threshold for how close an RGB pixel has to be to a background
colour pixel to be labelled as background.

"q" is now added to a hash table (if not already in the table) using the (integer) hashing
function:

h(q) = (q_red & 7)*2^6 + (q_green & 7)*2^3 + (q_blue & 7)    ...A-15

That is, the 3 least significant bits of each colour field are used. This function is
chosen to try and spread out the data into the available bins. Ideally each bin in the
hash table has a small number of colour entries. Each quantised colour RGB triple
is only added once to the table (the frequency of a value is irrelevant).
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Step 2) Segment each image 



Go through each RGB pixel, "v", in each image.



Set "w" to be the quantised version of "v" as before.
To decide whether "w" is in the hash table, explicitly look at all the entries in the bin
with index h(w) and see if any of them are the same as "w". If yes, then "v" is a
background pixel - set the corresponding pixel in the output image to BLACK. If no
then "v" is a foreground pixel - set the corresponding pixel in the output image to
WHITE.
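
The two stages of Method 2 can be sketched as below. The quantisation rule used here (integer division of each colour channel by the threshold "t") is only an assumption made so that the sketch runs; it is not taken from equation A-14. The function names, the 512-bin table size (the largest value of h plus one) and the 0/255 encoding of BLACK and WHITE are likewise illustrative.

```python
BLACK, WHITE = 0, 255  # assumed encodings for background and object

def quantise(p, t):
    # Assumed quantiser: integer division of each channel by the threshold t.
    return (p[0] // t, p[1] // t, p[2] // t)

def h(q):
    # Hash built from the 3 least significant bits of each colour field.
    return (q[0] & 7) * 2 ** 6 + (q[1] & 7) * 2 ** 3 + (q[2] & 7)

def build_table(background_pixels, t):
    """Stage 1: store each distinct quantised background colour once."""
    table = [[] for _ in range(512)]
    for p in background_pixels:
        q = quantise(p, t)
        if q not in table[h(q)]:
            table[h(q)].append(q)
    return table

def segment(image, table, t):
    """Stage 2: a pixel is background if its quantised colour is in the table."""
    output = []
    for row in image:
        out_row = []
        for v in row:
            w = quantise(v, t)
            # Different colours can share a bin, so compare against every entry.
            out_row.append(BLACK if w in table[h(w)] else WHITE)
        output.append(out_row)
    return output
```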

Post processing: for both methods a post process is performed to fill small holes and 
remove small isolated regions. 

A median filter is used with a circular window. (A circular window is chosen to avoid 
biasing the result in the x or y directions.) 

Build a circular mask of radius "r". Explicitly store the start and end values for each 
scan line on the circle. 

Go through each pixel in the binary image. 

Place the centre of the mask on the current pixel. Count the number of BLACK 
pixels and the number of WHITE pixels in the circular region. 

If (#WHITE pixels ≥ #BLACK pixels) then set the corresponding output pixel to
WHITE. Otherwise the output pixel is BLACK.
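
A sketch of this post-process follows. It assumes the 0/255 encoding for BLACK and WHITE, resolves ties in favour of WHITE, and ignores mask positions that fall outside the image; these choices and the function names are illustrative only.

```python
import math

BLACK, WHITE = 0, 255  # assumed encodings

def circular_spans(r):
    """For each scan line offset dy, the first and last dx inside the circle."""
    spans = []
    for dy in range(-r, r + 1):
        half = math.isqrt(r * r - dy * dy)
        spans.append((dy, -half, half))
    return spans

def median_filter(binary, r):
    """Majority vote of BLACK and WHITE pixels inside a circular window."""
    height, width = len(binary), len(binary[0])
    spans = circular_spans(r)
    output = [[BLACK] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            white = black = 0
            for dy, dx0, dx1 in spans:
                yy = y + dy
                if not 0 <= yy < height:
                    continue  # assumption: rows outside the image are ignored
                for xx in range(max(0, x + dx0), min(width - 1, x + dx1) + 1):
                    if binary[yy][xx] == WHITE:
                        white += 1
                    else:
                        black += 1
            output[y][x] = WHITE if white >= black else BLACK
    return output
```

Storing the start and end of each scan line once means the circle test is not repeated for every pixel of the image.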

3.3 Voxel carving

The aim of this process is to produce a 3D voxel grid, enclosing the object, with each 
of the voxels marked as either object or empty space. 



The input to the algorithm is: 



a set of binary segmentation images, each of which is associated with a camera 
position and orientation, 

2 sets of 3D co-ordinates, (xmin, ymin, zmin) and (xmax, ymax, zmax),
describing the opposite vertices of a cube surrounding the object;

a parameter, "n", giving the number of voxels required in the voxel grid. 

A pre-processing step calculates a suitable size for the voxels (they are cubes) and the
3D locations of the voxels, using "n", (xmin, ymin, zmin) and (xmax, ymax, zmax).

Then, for each of the voxels in the grid, the mid-point of the voxel cube is projected
into each of the segmentation images. If the projected point falls onto a pixel which
is marked as background, on any of the images, then the corresponding voxel is
marked as empty space, otherwise it is marked as belonging to the object.



Voxel carving is described further in "Rapid Octree Construction from Image 
Sequences" by R. Szeliski in CVGIP: Image Understanding, Volume 58, Number 1,
July 1993, pages 23-32. 
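
The mid-point test described above can be sketched as follows. Several details are assumptions made for the sketch: each camera is represented by a hypothetical callable mapping a 3D point to (column, row) image co-ordinates, the number of voxels per axis is taken as the rounded cube root of "n", background pixels are encoded as 0, and a point projecting outside an image is treated as background.

```python
import math

def carve(silhouettes, projectors, lo, hi, n):
    """Return a dict mapping voxel index (ix, iy, iz) to True (object) or
    False (empty space).

    silhouettes: binary images as rows of 0 (background) / 255 (object).
    projectors:  one callable per image, (x, y, z) -> (col, row).
    lo, hi:      opposite corners (xmin, ymin, zmin) and (xmax, ymax, zmax).
    """
    per_axis = max(1, round(n ** (1.0 / 3.0)))
    size = [(hi[i] - lo[i]) / per_axis for i in range(3)]
    grid = {}
    for ix in range(per_axis):
        for iy in range(per_axis):
            for iz in range(per_axis):
                index = (ix, iy, iz)
                centre = tuple(lo[i] + (index[i] + 0.5) * size[i]
                               for i in range(3))
                is_object = True
                for image, project in zip(silhouettes, projectors):
                    col, row = project(centre)
                    col, row = math.floor(col), math.floor(row)
                    inside = 0 <= row < len(image) and 0 <= col < len(image[0])
                    # Background in any one image is enough to carve the voxel.
                    if not inside or image[row][col] == 0:
                        is_object = False
                        break
                grid[index] = is_object
    return grid
```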

3.4 Marching cubes

The aim of the process is to produce a surface triangulation from a set of samples of
an implicit function representing the surface (for instance a signed distance function).
In the case where the implicit function has been obtained from a voxel carve, the
implicit function takes the value -1 for samples which are inside the object and +1 for
samples which are outside the object.
Marching cubes is an algorithm that takes a set of samples of an implicit surface (e.g.
a signed distance function) sampled at regular intervals on a voxel grid, and extracts
a triangulated surface mesh. Lorensen and Cline, and Bloomenthal, give details on
the algorithm and its implementation.

The marching-cubes algorithm constructs a surface mesh by "marching" around the
cubes while following the zero crossings of the implicit surface f(x)=0, adding to the
triangulation as it goes. The signed distance allows the marching-cubes algorithm to
interpolate the location of the surface with higher accuracy than the resolution of the
volume grid. The marching cubes algorithm can be used as a continuation method
(i.e. it finds an initial surface point and extends the surface from this point).
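
The interpolation of the surface location along one cube edge can be illustrated by the sketch below, which uses plain linear interpolation between the two sample values at the ends of the edge. This is only one step of marching cubes, not the whole algorithm, and the function name is hypothetical.

```python
def zero_crossing(p0, p1, f0, f1):
    """Point on the edge p0-p1 where the linearly interpolated implicit
    function is zero. f0 and f1 are the samples at p0 and p1 and must not
    have the same sign."""
    if f0 * f1 > 0:
        raise ValueError("the surface does not cross this edge")
    if f0 == f1:
        return tuple(float(a) for a in p0)  # both samples are exactly zero
    t = f0 / (f0 - f1)  # fraction of the way from p0 to p1
    return tuple(a + t * (b - a) for a, b in zip(p0, p1))
```

With the -1 and +1 values obtained from a voxel carve the crossing always falls at the mid-point of the edge; a true signed distance function places it more precisely.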

3.5 Decimation 

The aim of the process is to reduce the number of triangles in the model, making the 
model more compact and therefore easier to load and render in real time. 

The process reads in a triangular mesh and then removes each vertex in turn, in random
order, to see if the vertex contributes to the shape of the surface or not (i.e. if the hole
is filled, is the vertex a "long" way from the filled hole?). Vertices which do not contribute
to the shape are kept out of the triangulation. This results in fewer vertices (and hence
triangles) in the final model.
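
The test of whether a removed vertex is a "long" way from the filled hole can be sketched as below. The sketch covers only the distance test; the re-triangulation of the hole is assumed to have been done already, and the function names are hypothetical.

```python
import math

def plane_distance(v, triangle):
    """Perpendicular distance from point v to the plane of a triangle
    given as three (x, y, z) vertices."""
    a, b, c = triangle
    ab = [b[i] - a[i] for i in range(3)]
    ac = [c[i] - a[i] for i in range(3)]
    normal = [ab[1] * ac[2] - ab[2] * ac[1],
              ab[2] * ac[0] - ab[0] * ac[2],
              ab[0] * ac[1] - ab[1] * ac[0]]
    length = math.sqrt(sum(x * x for x in normal))
    if length == 0.0:
        raise ValueError("degenerate triangle")
    av = [v[i] - a[i] for i in range(3)]
    return abs(sum(normal[i] * av[i] for i in range(3))) / length

def vertex_is_removable(v, new_triangles, threshold):
    """True if v lies close to every triangle used to fill the hole."""
    return max(plane_distance(v, t) for t in new_triangles) < threshold
```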



The algorithm is described below in pseudo-code. 

INPUT

    Read in vertices
    Read in triples of vertex IDs making up triangles

PROCESSING

    Repeat NVERTEX times
        Choose a random vertex, V, which hasn't been chosen before
        Locate set of all triangles having V as a vertex, S
        Order S so adjacent triangles are next to each other
        Re-triangulate triangle set, ignoring V (i.e. remove selected triangles
        & V and then fill in hole)
        Find the maximum distance between V and the plane of each triangle
        If (distance < threshold)
            Discard V and keep new triangulation
        Else
            Keep V and return to old triangulation

OUTPUT

    Output list of kept vertices
    Output updated list of triangles

The process therefore combines adjacent triangles in the model produced by the
marching cubes algorithm, if this can be done without introducing large errors into the
model.

The selection of the vertices is carried out in a random order in order to avoid the
effect of gradually eroding a large part of the surface by consecutively removing
neighbouring vertices.

3.6 Further Surface Generation Techniques




Further techniques which may be employed to generate a 3D computer model of an
object surface include voxel colouring, for example as described in "Photorealistic
Scene Reconstruction by Voxel Coloring" by Seitz and Dyer in Proc. Conf. Computer
Vision and Pattern Recognition 1997, pp. 1067-1073, "Plenoptic Image Editing" by
Seitz and Kutulakos in Proc. 6th International Conference on Computer Vision, pp.
17-24, "What Do N Photographs Tell Us About 3D Shape?" by Kutulakos and Seitz
in University of Rochester Computer Sciences Technical Report 680, January 1998,
and "A Theory of Shape by Space Carving" by Kutulakos and Seitz in University of
Rochester Computer Sciences Technical Report 692, May 1998.

4. TEXTURING

The aim of the process is to texture each surface polygon (typically a triangle) with 
the most appropriate image texture. The output of the process is a VRML model of 
the surface, complete with texture co-ordinates. 

For each surface triangle, the image in which the triangle has the largest projected area
is a good image to use for texturing, as it is the image in which the texture will appear
at highest resolution.

A good approximation to the triangle with the largest projected area, under the 
assumption that there is no substantial difference in scale between the different 
images, can be obtained in the following way. 

For each surface triangle, the image "i" is found such that the triangle is the most front
facing (i.e. having the greatest value for $n_t \cdot v_i$, where $n_t$ is the triangle normal and
$v_i$ is the viewing direction for the "i"th camera). The vertices of the projected
triangle are then used as texture co-ordinates in the resulting VRML model.
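
The selection of the most front-facing image can be sketched as follows. The sketch assumes that the triangle normal and the viewing directions are unit vectors expressed in the same co-ordinate frame and that the sign convention is the one used in the text (larger dot product means more front facing). The function name is hypothetical.

```python
def most_front_facing_image(normal, view_directions):
    """Index of the image whose viewing direction has the greatest dot
    product with the triangle normal."""
    def dot(u, v):
        return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]
    return max(range(len(view_directions)),
               key=lambda i: dot(normal, view_directions[i]))
```

To texture from a subset of the images only, the same function can be called with the viewing directions of that subset.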

This technique can fail where there is a substantial amount of self-occlusion, or
several objects occluding each other. This is because the technique does not take into 
account the fact that the object may occlude the selected triangle. However, in 
practice this does not appear to be much of a problem. 
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It has been found that, if every image is used for texturing then this can result in very 
large VRML models being produced. These can be cumbersome to load and render 
in real time. Therefore, in practice, a subset of images is used to texture the model. 
This subset may be specified in a configuration file. 



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